Full-length human cDNAs encoding potentially secreted proteins

Related application 

The present application claims priority to US Provisional Patent Application Serial Nos.
60/169,629 and 60/187,470, filed December 8, 1999, and March 6, 2000, respectively, the
disclosures of which are incorporated herein by reference in their entireties.

Field of the invention 

The present invention is directed to polynucleotides encoding GENSET polypeptides,
fragments thereof, and the regulatory regions located in the 5'- and 3'-ends of the GENSET genes.
The invention also concerns polypeptides encoded by the GENSET polynucleotides and fragments
thereof. The present invention also relates to recombinant vectors, which include the
polynucleotides of the present invention, particularly recombinant vectors comprising a GENSET
regulatory region or a sequence encoding a GENSET polypeptide, and to host cells containing the
polynucleotides of the invention, as well as to methods of making such vectors and host cells. The
present invention further relates to the use of these recombinant vectors and host cells in the
production of the polypeptides of the invention. The invention further relates to antibodies that
specifically bind to the polypeptides of the invention and to methods for producing such antibodies
and fragments thereof. The invention also provides for methods of detecting the presence of the
polynucleotides and polypeptides of the present invention in a sample, methods of diagnosis and
screening of abnormal GENSET gene expression and/or biological activity, methods of screening
compounds for their ability to modulate the activity or expression of GENSET genes, and uses of
such compounds.

Background of the invention 

The estimated 50,000-100,000 genes scattered along the human chromosomes offer 
tremendous promise for the understanding, diagnosis, and treatment of human diseases. In addition, 
probes capable of specifically hybridizing to loci distributed throughout the human genome find
applications in the construction of high resolution chromosome maps and in the identification of 
individuals. 

Currently, two different approaches are being pursued for identifying and characterizing the
genes distributed along the human genome. In one approach, large fragments of genomic DNA are
isolated, cloned, and sequenced. Potential open reading frames in these genomic sequences are
identified using bio-informatics software. However, this approach entails sequencing large
stretches of human DNA which do not encode proteins in order to find the protein encoding
sequences scattered throughout the genome. In addition to requiring extensive sequencing, the
bio-informatics software may mischaracterize the genomic sequences obtained, i.e., labeling
non-coding DNA as coding DNA and vice versa.
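
As a purely illustrative sketch of the kind of open reading frame scan such software performs, the following Python fragment reports ATG-to-stop spans in the three forward reading frames. The function name find_orfs, the min_codons threshold, and the example sequence are assumptions introduced here for illustration. The sketch is not the software referred to above; it ignores introns, the reverse strand, and codon statistics, which is one reason a naive scan of genomic DNA can label non-coding sequence as coding.

```python
# Naive forward-strand ORF scan (illustrative only; names are hypothetical).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return sorted (start, end) index pairs of ATG...stop spans.

    `end` is exclusive and includes the stop codon. `min_codons` counts
    the start and stop codons and is an arbitrary illustrative cutoff.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the frame one codon at a time.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the candidate and keep scanning
    return sorted(orfs)
```

For the made-up sequence "CCATGAAATTTTAGGG", find_orfs returns [(2, 14)], i.e. the span ATGAAATTTTAG in the third forward frame.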

An alternative approach takes a more direct route to identifying and characterizing human
genes. In this approach, complementary DNAs (cDNAs) are synthesized from isolated messenger
RNAs (mRNAs) which encode human proteins. Using this approach, sequencing is only performed
on DNA which is derived from protein coding fragments of the genome. In the past, these cDNAs,
often short EST sequences, were obtained from oligo-dT primed cDNA libraries. Accordingly, they
mainly corresponded to the 3' untranslated region of the mRNA. In part, the prevalence of EST
sequences derived from the 3' end of the mRNA is a result of the fact that typical techniques for
obtaining cDNAs are not well suited for isolating cDNA sequences derived from the 5' ends of
mRNAs (Adams et al., Nature 377:3-174, 1996; Hillier et al., Genome Res. 6:807-828, 1996). In
addition, in those reported instances where longer cDNA sequences have been obtained, the
reported sequences typically correspond to coding sequences and do not include the full 5'
untranslated region (5'UTR) of the mRNA from which the cDNA is derived. Indeed, 5'UTRs have
been shown to affect either the stability or translation of mRNAs. Thus, regulation of gene
expression may be achieved through the use of alternative 5'UTRs as shown, for instance, for the
translation of the tissue inhibitor of metalloprotease mRNA in mitogenically activated cells
(Waterhouse et al., J Biol Chem. 265:5585-9, 1990). Furthermore, modification of the 5'UTR through
mutation, insertion or translocation events may even be implicated in pathogenesis. For instance,
fragile X syndrome, the most common cause of inherited mental retardation, is partly due to an
insertion of multiple CGG trinucleotides in the 5'UTR of the fragile X mRNA resulting in the
inhibition of protein synthesis via ribosome stalling (Feng et al., Science 268:731-4, 1995). An
aberrant mutation in regions of the 5'UTR known to inhibit translation of the proto-oncogene c-myc
was shown to result in upregulation of c-myc protein levels in cells derived from patients with
multiple myelomas (Willis et al., Curr Top Microbiol Immunol 224:269-76, 1997). In addition, the
use of oligo-dT primed cDNA libraries does not allow the isolation of complete 5'UTRs since the
incomplete sequences obtained by this process may not include the first exon of the mRNA,
particularly in situations where the first exon is short. Furthermore, they may not include some
exons, often short ones, which are located upstream of splicing sites. Thus, there is a need to obtain
sequences derived from the 5' ends of mRNAs.

Moreover, despite the great amount of EST data that large-scale sequencing projects have
yielded (Adams et al., Nature 377:3-174, 1996; Hillier et al., Genome Res. 6:807-828, 1996),
information concerning the biological function of the mRNAs corresponding to such obtained
cDNAs has proven to be limited. Indeed, whereas knowledge of the complete coding
sequence is absolutely necessary to investigate the biological function of mRNAs, ESTs yield only
partial coding sequences. So far, large-scale full-length cDNA cloning has been achieved only with
limited success because of the poor efficiency of methods for constructing full-length cDNA
libraries. Indeed, such methods either require a large amount of mRNA (Ederly et al., 1995), thus
resulting in non-representative full-length libraries when only small amounts of tissue are available, or
require PCR amplification (Maruyama et al., 1994; CLONTECHniques, 1996) to obtain a
reasonable number of clones, thus yielding strongly biased cDNA libraries where rare and long
cDNAs are lost. Thus, there is a need to obtain full-length cDNAs, i.e. cDNAs containing the full
coding sequence of their corresponding mRNAs. The present application presents a number of
cDNAs, called GENSET polynucleotides, isolated from full-length cDNA libraries obtained by
the methods described in PCT publication WO 00/37491.

While many sequences derived from human chromosomes have practical applications,
approaches based on the identification and characterization of those chromosomal sequences which
encode a protein product are particularly relevant to diagnostic and therapeutic uses. Of the
50,000-100,000 protein coding genes, those genes encoding proteins which are secreted from the cell in
which they are synthesized, as well as the secreted proteins themselves, are particularly valuable as
potential therapeutic agents. Such proteins are often involved in cell to cell communication and
may be responsible for producing a clinically relevant response in their target cells. In fact, several
secretory proteins, including tissue plasminogen activator, G-CSF, GM-CSF, erythropoietin, human
growth hormone, insulin, interferon-α, interferon-β, interferon-γ, and interleukin-2, are currently in
clinical use. These proteins are used to treat a wide range of conditions, including acute myocardial
infarction, acute ischemic stroke, anemia, diabetes, growth hormone deficiency, hepatitis, kidney
carcinoma, chemotherapy induced neutropenia and multiple sclerosis. For these reasons, cDNAs
encoding secreted proteins or fragments thereof represent a particularly valuable source of
therapeutic agents. Thus, there is a need for the identification and characterization of secreted
proteins and the nucleic acids encoding them.

In addition to being therapeutically useful themselves, secretory proteins include short
peptides, called signal peptides, at their amino termini which direct their secretion. These signal
peptides are encoded by the signal sequences located at the 5' ends of the coding sequences of genes
encoding secreted proteins. Because these signal peptides will direct the extracellular secretion of
any protein to which they are operably linked, the signal sequences may be exploited to direct the
efficient secretion of any protein by operably linking the signal sequences to a gene encoding the
protein for which secretion is desired. In addition, fragments of the signal peptides, called
membrane-translocating sequences, may also be used to direct the intracellular import of a peptide
or protein of interest. This may prove beneficial in gene therapy strategies in which it is desired to
deliver a particular gene product to cells other than the cells in which it is produced. Signal
sequences encoding signal peptides also find application in simplifying protein purification
techniques. In such applications, the extracellular secretion of the desired protein greatly facilitates
purification by reducing the number of undesired proteins from which the desired protein must be
selected. Thus, there exists a need to identify and characterize the 5' fragments of the genes for
secretory proteins which encode signal peptides.

Sequences coding for human proteins may also find application as therapeutics or 
diagnostics. In particular, such sequences may be used to determine whether an individual is likely 
to express a detectable phenotype, such as a disease, as a consequence of a mutation in the coding
sequence for a protein. In instances where the individual is at risk of suffering from a disease or
other undesirable phenotype as a result of a mutation in such a coding sequence, the undesirable
phenotype may be corrected by introducing a normal coding sequence using gene therapy.
Alternatively, if the undesirable phenotype results from overexpression of the protein encoded by
the coding sequence, expression of the protein may be reduced using antisense or triple helix based
strategies. 

The GENSET human polypeptides encoded by the coding sequences may also be used as 
therapeutics by administering them directly to an individual having a condition, such as a disease, 
resulting from a mutation in the sequence encoding the polypeptide. In such an instance, the
condition can be cured or ameliorated by administering the polypeptide to the individual.

In addition, the human polypeptides or fragments thereof may be used to generate 
antibodies useful in determining the tissue type or species of origin of a biological sample. The 
antibodies may also be used to determine the subcellular localization of the human polypeptides or 
the cellular localization of polypeptides which have been fused to the human polypeptides. In
addition, the antibodies may also be used in immunoaffinity chromatography techniques to isolate,
purify, or enrich the human polypeptide or a target polypeptide which has been fused to the human 
polypeptide. 

Public information on the number of human genes for which the promoters and upstream
regulatory regions have been identified and characterized is quite limited. In part, this may be due
to the difficulty of isolating such regulatory sequences. Upstream regulatory sequences such as
transcription factor binding sites are typically too short to be utilized as probes for isolating
promoters from human genomic libraries. Recently, some approaches have been developed to
isolate human promoters. One of them consists of making a CpG island library (Cross et al., Nature
Genetics 6:236-244, 1994). The second consists of isolating human genomic DNA sequences
containing SpeI binding sites by the use of SpeI binding protein (Mortlock et al., Genome Res.
6:327-335, 1996). Both of these approaches have their limits due to a lack of specificity and of
comprehensiveness. Thus, there exists a need to identify and systematically characterize the 5'
fragments of the genes.

cDNAs including the 5' ends of their corresponding mRNA may be used to efficiently
identify and isolate 5'UTRs and upstream regulatory regions which control the location,
developmental stage, rate, and quantity of protein synthesis, as well as the stability of the mRNA
(Theil et al., BioFactors 4:87-93, 1993). Once identified and characterized, these regulatory
regions may be utilized in gene therapy or protein purification schemes to obtain the desired amount
and locations of protein synthesis or to inhibit, reduce, or prevent the synthesis of undesirable gene
products.

In addition, cDNAs containing the 5' ends of protein genes may include sequences useful as
probes for chromosome mapping and the identification of individuals. Thus, there is a need to
identify and characterize the sequences upstream of the 5' coding sequences of genes encoding 
proteins. 

Summary of the invention

The present invention provides compositions containing a purified or isolated
polynucleotide comprising, consisting of, or consisting essentially of a nucleotide sequence selected
from the group consisting of: (a) the sequences of SEQ ID NOs: 1-241; (b) the sequences of clone
inserts of the deposited clone pool; (c) the full coding sequences of SEQ ID NOs: 1-241; (d) the full
coding sequences of the clone inserts of the deposited clone pool; (e) the sequences encoding one of
the polypeptides of SEQ ID NOs: 242-482; (f) the sequences encoding one of the polypeptides
encoded by the clone inserts of the deposited clone pool; (g) the genomic sequences coding for
GENSET polypeptides; (h) the 5' transcriptional regulatory regions of GENSET genes; (i) the 3'
transcriptional regulatory regions of GENSET genes; (j) the polynucleotides comprising the
nucleotide sequence of any combination of (g)-(i); (k) the variant polynucleotides of any of the
polynucleotides of (a)-(j); (l) the polynucleotides comprising a nucleotide sequence of (a)-(k),
wherein the polynucleotide is single stranded, double stranded, or a portion is single stranded and a
portion is double stranded; and (m) the polynucleotides comprising a nucleotide sequence
complementary to any of the single stranded polynucleotides of (l). The invention further provides
for fragments of the nucleic acid molecules of (a)-(m) described above.

The present invention also provides biologically active forms, variants, fragments and
derivatives of the present proteins, where "biologically active" indicates that the form, variant,
fragment, or derivative, has any detectable activity in any in vitro assay known in the art or 
described herein, or has any detectable function in vivo. In preferred embodiments, a determination 
of whether a particular polypeptide is biologically active will be made based on any of the specific 
assays or functional characteristics provided below for each of the proteins of this invention. 

Therefore, one embodiment of the present invention is a composition containing a purified
or isolated nucleic acid comprising a sequence selected from the group consisting of sequences of
SEQ ID NOs: 1-241 and sequences of clone inserts of the deposited clone pool, sequences 
complementary thereto, allelic variants thereof, and degenerate variants thereof. In one aspect of 
this embodiment, the nucleic acid is recombinant. 

Another embodiment of the present invention is a composition containing a purified or
isolated nucleic acid comprising at least 8 consecutive nucleotides of a sequence selected from the
group consisting of sequences of SEQ ID NOs: 1-241 and sequences of clone inserts of the
deposited clone pool, sequences complementary thereto, allelic variants thereof, and degenerate
variants thereof. In one aspect of this embodiment, the nucleic acid comprises at least 10, 12, 15,
18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500, 800, 1000, 1500, or 2000
consecutive nucleotides of said selected sequence, sequences complementary thereto, allelic variants
thereof, and degenerate variants thereof. The nucleic acid may be a recombinant nucleic acid.
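
By way of illustration only, the "at least 8 consecutive nucleotides" criterion, including the case of sequences complementary to a reference, can be checked computationally along the following lines. The function names reverse_complement and shares_contiguous_span, the default min_len of 8, and the short sequences used to exercise them are assumptions made for this sketch; they are not sequences of the invention. Allelic and degenerate variants are not handled.

```python
# Illustrative check for a shared run of consecutive nucleotides
# (hypothetical helper names; plain A/C/G/T sequences assumed).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def shares_contiguous_span(candidate, reference, min_len=8, check_complement=True):
    """True if `candidate` contains at least `min_len` consecutive nucleotides
    of `reference` or, optionally, of its reverse complement."""
    candidate = candidate.upper()
    targets = [reference.upper()]
    if check_complement:
        targets.append(reverse_complement(reference))
    # Slide a window of min_len over the candidate and look it up in each target.
    for start in range(len(candidate) - min_len + 1):
        span = candidate[start:start + min_len]
        if any(span in target for target in targets):
            return True
    return False
```

A candidate shorter than min_len can never satisfy the criterion, so the function returns False for it without raising an error.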

Another embodiment of the present invention is a composition comprising a vertebrate 
purified or isolated nucleic acid of at least 15, 18, 20, 23, 25, 28, 30, 35, 40, 50, 75, 100, 200, 300,
500, 1000 or 2000 nucleotides in length which hybridizes under stringent conditions to any
polynucleotide of the invention, preferably a sequence selected from the group consisting of
sequences of SEQ ID NOs: 1-241 and sequences of clone inserts of the deposited clone pool, and
sequences complementary thereto. In one aspect of this embodiment, the nucleic acid is
recombinant. 

Another embodiment of the present invention is a composition containing a purified or 
isolated nucleic acid comprising the full coding sequences of a sequence selected from the group
consisting of sequences of SEQ ID NOs: 1-241 and sequences of clone inserts of the deposited 
clone pool, or an allelic variant thereof. In one aspect of this embodiment, the nucleic acid is 
recombinant. 

A further embodiment of the present invention is a composition containing a purified or
isolated nucleic acid comprising a contiguous span of a sequence selected from the group consisting
of sequences of SEQ ID NOs: 1-31 and 33-143 and sequences of clone inserts encoding secreted
proteins in the deposited clone pool, or an allelic variant thereof, wherein said contiguous span
encodes a mature protein. In one aspect of this embodiment, the nucleic acid is recombinant. In
another aspect of this embodiment, the nucleic acid is an expression vector wherein said contiguous
span which encodes a mature protein is operably linked to a promoter.

Yet another embodiment of the present invention is a composition containing a purified or 
isolated nucleic acid comprising a contiguous span of a sequence selected from the group consisting 
of sequences of SEQ ID NOs: 1-31 and 33-143 and sequences of clone inserts encoding secreted 
proteins in the deposited clone pool, or an allelic variant thereof, wherein said contiguous span
encodes a signal peptide. In one aspect of this embodiment, the nucleic acid is recombinant. In
another aspect of this embodiment, the nucleic acid is a fusion vector wherein said contiguous
span which encodes a signal peptide is operably linked to a second nucleic acid encoding a
heterologous polypeptide.

Another embodiment of the present invention is a composition containing a purified or
isolated nucleic acid encoding a polypeptide comprising a sequence selected from the group
consisting of sequences of SEQ ID NOs: 1-241 and sequences of clone inserts of the deposited
clone pool, or allelic variant thereof. In one aspect of this embodiment, the nucleic acid is
recombinant.

Another embodiment of the present invention is a composition containing a purified or 
isolated nucleic acid encoding a polypeptide comprising the sequence of a mature protein included 
in a sequence selected from the group consisting of sequences of SEQ ID NOs: 1-31 and 33-143
and sequences of clone inserts encoding secreted proteins in the deposited clone pool, or allelic 
variant thereof. In one aspect of this embodiment, the nucleic acid is recombinant. 

Another embodiment of the present invention is a composition containing a purified or 
isolated nucleic acid encoding a polypeptide comprising the sequence of a signal peptide included
in a sequence selected from the group consisting of sequences of SEQ ID NOs: 1-31 and 33-143
and sequences of clone inserts encoding secreted proteins in the deposited clone pool, or allelic
variant thereof. In one aspect of this embodiment, the nucleic acid is present in a vector of the
invention.

Further embodiments of the invention include compositions containing purified or isolated
polynucleotides that comprise a nucleotide sequence at least 70% identical, more preferably at least
75% identical, and still more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99%
identical to any of the polynucleotides of the present invention. Methods of determining identity
include those well known in the art and described herein. Such analyses can be performed using a
full length polynucleotide sequence or using a subsequence of any length. For example, any two
sequences can be compared over a region, in either polynucleotide or in both polynucleotides, of
any 10, 25, 50, 100, 250, 500, 1000, 2000 or more contiguous nucleotides. In addition, any two
sequences can be identified as homologous even when they share sequence homology over a limited
region of either polynucleotide, for example over a region of at least about 10, 25, 50, 100, 250,
500, 1000, or more contiguous nucleotides.
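
As a minimal sketch of how a percent identity figure over a region of contiguous nucleotides might be computed, the following Python fragment compares ungapped windows position by position. It is not one of the methods "well known in the art" referred to above, which typically rely on gapped alignment; the function names percent_identity and best_window_identity and the example sequences are assumptions introduced for illustration.

```python
# Ungapped percent identity over fixed-length windows (illustrative only).
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def best_window_identity(seq_a, seq_b, window):
    """Highest percent identity over any pair of ungapped windows of length `window`,
    one taken from each sequence."""
    if window <= 0 or window > min(len(seq_a), len(seq_b)):
        raise ValueError("window must be between 1 and the shorter sequence length")
    best = 0.0
    # Exhaustive comparison; adequate for short sequences, quadratic in length.
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            score = percent_identity(seq_a[i:i + window], seq_b[j:j + window])
            best = max(best, score)
    return best
```

For instance, percent_identity("ACGTACGTAC", "ACGTTCGTAC") gives 90.0, since nine of ten positions match. A threshold such as "at least 70% identical" would then be applied to the returned value.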

The invention further provides compositions containing a purified or isolated polypeptide
comprising, consisting of, or consisting essentially of an amino acid sequence selected from the
group consisting of: (a) the polypeptides of SEQ ID NOs: 242-482; (b) the polypeptides encoded by
the clone inserts of the deposited clone pool; (c) the epitope-bearing fragments of the polypeptides
of SEQ ID NOs: 242-482; (d) the epitope-bearing fragments of the polypeptides encoded by the
clone inserts contained in the deposited clone pool; (e) the domains of the polypeptides of SEQ ID
NOs: 242-482; (f) the domains of the polypeptides encoded by the clone inserts contained in the
deposited clone pool; and (g) the allelic variant polypeptides of any of the polypeptides of (a)-(f).
The invention further provides for fragments of the polypeptides of (a)-(g) above, such as those
having biological activity or comprising biologically functional domain(s).

Yet another embodiment of the present invention is a composition containing a purified or
isolated protein comprising a sequence selected from the group consisting of sequences of SEQ ID
NOs: 242-482 and sequences of polypeptides encoded by clone inserts of the deposited clone pool, 
or allelic variant thereof. 

Another embodiment of the present invention is a composition containing a purified or
isolated polypeptide comprising at least 5, 6 or 8 consecutive amino acids of a sequence selected
from the group consisting of sequences of SEQ ID NOs: 242-482 and sequences of polypeptides
encoded by clone inserts of the deposited clone pool, or allelic variant thereof. In one aspect of this
embodiment, the purified or isolated polypeptide comprises at least 10, 12, 15, 20, 25, 30, 35, 40,
50, 60, 75, 100, 150, 200, 250, 300, 350, 400, 450 or 500 consecutive amino acids of said selected
sequence or allelic variant thereof.

Another embodiment of the present invention is a composition containing an isolated or
purified polypeptide comprising a signal peptide of a sequence selected from the group consisting
of sequences of SEQ ID NOs: 242-272 and 274-384 and sequences of polypeptides encoded by
clone inserts of the deposited clone pool, or allelic variant thereof.

Yet another embodiment of the present invention is a composition containing an isolated or 
purified polypeptide comprising a mature protein of a sequence selected from the group consisting 
of sequences of SEQ ID NOs: 242-272 and 274-384 and sequences of polypeptides encoded by
clone inserts of the deposited clone pool, or allelic variant thereof.

Further embodiments of the present invention are compositions containing polypeptides
having an amino acid sequence with at least 70% similarity, and more preferably at least 75%, 80%,
85%, 90%, 95%, 96%, 97%, 98%, or 99% similarity to a polypeptide of the present invention, as
well as polypeptides having an amino acid sequence at least 70% identical, more preferably at least
75% identical, and still more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99%
identical to a polypeptide of the present invention. Such analyses can be performed using a full length
polypeptide sequence or using a subsequence of any length. For example, any two sequences can
be compared over a region, in either protein or in both proteins, of any 10, 25, 50, 100, 250, 500,
1000, 2000 or more contiguous amino acids. In addition, any two sequences can be identified as
homologous even when they share sequence homology over a limited region of either protein, for
example over a region of at least about 10, 25, 50, 100, 250, 500, 1000, or more contiguous amino
acids. Further included in the invention are compositions comprising a purified or isolated nucleic
acid molecule encoding such polypeptides. Methods for determining identity include those well
known in the art and described herein.

The present invention also relates to compositions comprising recombinant vectors, which
include the purified or isolated polynucleotides of the present invention, and to host cells
recombinant for the polynucleotides of the present invention, as well as to methods of making such 
vectors and host cells. The present invention further relates to the use of these recombinant vectors 
and recombinant host cells in the production of GENSET polypeptides. 

Consequently, another embodiment of the invention is a vector comprising any
polynucleotide of the invention. In a preferred embodiment, the vector is an expression vector
comprising a nucleic acid sequence encoding a polypeptide selected from the group consisting of
sequences of SEQ ID NOs: 242-482 and sequences of polypeptides encoded by the clone inserts of
the deposited clone pool, or allelic variant thereof, wherein said nucleic acid sequence is operably
linked to a promoter. In another preferred embodiment, the vector is a secretion vector comprising
a nucleic acid sequence encoding a signal peptide selected from the group consisting of signal
peptides of sequences of SEQ ID NOs: 242-272 and 274-384 and sequences of secreted
polypeptides encoded by the clone inserts of the deposited clone pool, or allelic variant thereof,
wherein said nucleic acid sequence is operably linked to a heterologous protein such that said
signal peptide will direct the secretion of said heterologous protein.

A further embodiment of the present invention is a method of making a protein comprising
a sequence selected from the group consisting of sequences of SEQ ID NOs: 242-482 and
sequences of polypeptides encoded by clone inserts of the deposited clone pool, comprising the
steps of:

a) obtaining a cDNA comprising a sequence selected from the group consisting of
sequences of SEQ ID NOs: 1-241 and sequences of clone inserts of the deposited clone pool;

b) inserting said cDNA in an expression vector such that said cDNA is operably linked to a
promoter; and

c) introducing said expression vector into a host cell whereby the host cell produces the
protein encoded by said cDNA.

In one aspect of this embodiment, the method further comprises the step of isolating said
protein.

Another embodiment of the present invention is a protein obtainable by the method 
described in the preceding paragraph. 

Another embodiment of the present invention is a method of making a protein comprising
the amino acid sequence of the mature protein contained in a sequence selected from the group
consisting of sequences of SEQ ID NOs: 242-272 and 274-384 and sequences of polypeptides
encoded by clone inserts of the deposited clone pool, comprising the steps of:

a) obtaining a cDNA comprising a sequence selected from the group consisting of
sequences of SEQ ID NOs: 1-31 and 33-143 and sequences of clone inserts of the deposited clone
pool, wherein said cDNA encodes a mature protein;

b) inserting said cDNA in an expression vector such that said cDNA is operably linked to a
promoter; and

c) introducing said expression vector into a host cell whereby the host cell produces the
mature protein encoded by said cDNA.

In one aspect of this embodiment, the method further comprises the step of isolating said
protein.

Another embodiment of the present invention is a mature protein obtainable by the method
described in the preceding paragraph.

Another embodiment of the present invention is a composition containing a host cell
containing the purified or isolated nucleic acids comprising a sequence selected from the group
consisting of sequences of SEQ ID NOs: 1-241 and sequences of clone inserts of the deposited
clone pool or a sequence complementary thereto described herein.

Another embodiment of the present invention is a composition containing a host cell
containing the purified or isolated nucleic acids comprising the full coding sequences of a sequence
selected from the group consisting of sequences of SEQ ID NOs: 1-241 and sequences of clone
inserts of the deposited clone pool.

Another embodiment of the present invention is a composition containing a host cell
containing the purified or isolated nucleic acids comprising a contiguous span of a sequence
selected from the group consisting of sequences of SEQ ID NOs: 1-31 and 33-143 and sequences of
clone inserts of the deposited clone pool, wherein said contiguous span codes for a mature protein.

Another embodiment of the present invention is a composition containing a host cell
containing the purified or isolated nucleic acids comprising a contiguous span of a sequence
selected from the group consisting of sequences of SEQ ID NOs: 1-31 and 33-143 and sequences of
clone inserts of the deposited clone pool, wherein said contiguous span codes for a signal peptide.

The invention further relates to other methods of making the polypeptides of the present 
invention. 

The present invention further relates to transgenic plants or animals, wherein said transgenic 
plant or animal is transgenic for a polynucleotide of the present invention and expresses a
polypeptide of the present invention. 

The invention further relates to compositions comprising antibodies that specifically bind to
the GENSET polypeptides of the present invention and fragments thereof as well as to methods for
producing such antibodies and fragments thereof.

Therefore, another embodiment of the present invention is a composition containing a
purified or isolated antibody capable of specifically binding to a protein comprising a sequence
selected from the group consisting of sequences of SEQ ID NOs: 242-482 and sequences of
polypeptides encoded by clone inserts of the deposited clone pool. In one aspect of this
embodiment, the antibody is capable of binding to a polypeptide comprising at least 6 consecutive
amino acids, at least 8 consecutive amino acids, or at least 10 consecutive amino acids of said
selected sequence.

The invention also provides kits and methods of detecting GENSET gene expression and/or 
biological activity in a biological sample. One such method involves assaying for the expression of 
a GENSET polynucleotide in a biological sample using polymerase chain reaction (PCR) to amplify 
and detect GENSET polynucleotides or Southern and Northern blot hybridization to detect
GENSET genomic DNA, cDNA or mRNA. Alternatively, a method of detecting GENSET gene
expression in a test sample can be accomplished using a compound which binds to a GENSET
polypeptide of the present invention or a portion of a GENSET polypeptide. 

The present invention also relates to diagnostic methods of identifying individuals or
non-human animals having elevated or reduced levels of GENSET products, which individuals are
likely to benefit from therapies to suppress or enhance GENSET gene expression, respectively, and
to methods of identifying individuals or non-human animals at increased risk of developing, or
presently having, certain diseases/disorders associated with abnormal GENSET gene
expression or biological activity.

The present invention also relates to kits and methods of screening compounds for their 
ability to modulate (e.g. increase or inhibit) the activity or expression of GENSET genes, including
compounds that interact with GENSET gene regulatory sequences and compounds that interact
directly or indirectly with GENSET polypeptides. Uses of such compounds are also within the
scope of the present invention.

The present invention also relates to pharmaceutical or physiologically acceptable 
compositions comprising, as an active agent, the polypeptides, polynucleotides or antibodies of the
present invention. 

The present invention also relates to computer systems containing cDNA codes and
polypeptide codes of sequences of the invention and to computer-related methods of comparing
sequences and identifying homology or features using GENSET sequences of the invention.

In another aspect, the present invention provides an isolated polynucleotide, said
polynucleotide comprising a nucleic acid sequence encoding i) a polypeptide comprising an amino
acid sequence having at least about 80% identity to any one of the sequences shown as SEQ ID
NOs: 242-482 or any one of the sequences of polypeptides encoded by the clone inserts of the
deposited clone pool; or ii) a biologically active fragment of said polypeptide.

In one embodiment, the polypeptide comprises any one of the sequences shown as SEQ ID
NOs: 242-482 or any one of the sequences of the polypeptides encoded by the clone inserts of the
deposited clone pool. In another embodiment, the polypeptide comprises a signal peptide. In
another embodiment, the polypeptide is a mature protein. In another embodiment, the nucleic acid
sequence has at least about 80% identity over at least about 100 contiguous nucleotides to any one
of the sequences shown as SEQ ID NOs: 1-241 or any one of the sequences of the clone inserts of
the deposited clone pool. In another embodiment, the polynucleotide hybridizes under stringent
conditions to a polynucleotide comprising any one of the sequences shown as SEQ ID NOs: 1-241
or any one of the sequences of the clone inserts of the deposited clone pool. In another
embodiment, the nucleic acid sequence comprises any one of the sequences shown as SEQ ID
NOs: 1-241 or any one of the sequences of the clone inserts of the deposited clone pool. In another
embodiment, the polynucleotide is operably linked to a promoter.

In another aspect, the present invention provides an expression vector comprising the
polynucleotide operably linked to a promoter. In another aspect, the present invention provides a
host cell recombinant for the polynucleotide. In another aspect, the present invention provides a
non-human transgenic animal comprising the host cell.

In another aspect, the present invention provides a method of making a GENSET
polypeptide, the method comprising a) providing a population of host cells comprising a
herein-described polynucleotide and b) culturing the population of host cells under conditions conducive to
the production of the polypeptide within said host cells.

In one embodiment, the method further comprises purifying the polypeptide from the 
population of host cells.

In another aspect, the present invention provides a method of making a GENSET 
polypeptide, the method comprising a) providing a population of cells comprising a
herein-described polynucleotide; b) culturing the population of cells under conditions conducive to the
production of the polypeptide within the cells; and c) purifying the polypeptide from the population
of cells.

In another aspect, the present invention provides an isolated polynucleotide, the 
polynucleotide comprising a nucleic acid sequence having at least about 80% identity over at least 
about 100 contiguous nucleotides to any one of the sequences shown as SEQ ID NOs: 1-241 or any 
one of the sequences of the clone inserts of the deposited clone pool. 
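
The "at least about 80% identity over at least about 100 contiguous nucleotides" criterion can be illustrated with a minimal sketch. The sketch assumes a simple ungapped, position-by-position comparison; it is not the comparison algorithm of the specification, and the function names, the window of 100 and the threshold of 80 are illustrative parameters only.

```python
def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def has_identity_window(query, reference, window=100, threshold=80.0):
    """True if any `window`-long stretch of `query` reaches `threshold`
    percent ungapped identity with any `window`-long stretch of `reference`.

    Brute force for clarity only; real comparisons would use an
    alignment program that also handles gaps.
    """
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(reference) - window + 1):
            if percent_identity(q, reference[j:j + window]) >= threshold:
                return True
    return False
```

For example, a 100-nucleotide query that differs from a 100-nucleotide reference at 20 positions meets the illustrative 80% threshold, whereas one that differs at 21 positions does not.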

In one embodiment, the polynucleotide hybridizes under stringent conditions to a
polynucleotide comprising any one of the sequences shown as SEQ ID NOs: 1-241 or any one of the
sequences of the clone inserts of the deposited clone pool. In another embodiment, the 
polynucleotide comprises any one of the sequences shown as SEQ ID NOs: 1-241 or any one of the 
sequences of the clone inserts of the deposited clone pool. 

In another aspect, the present invention provides a biologically active polypeptide encoded
by any of the herein-described polynucleotides.

In another aspect, the present invention provides an isolated polypeptide or biologically 
active fragment thereof, the polypeptide comprising an amino acid sequence having at least about 
80% sequence identity to any one of the sequences shown as SEQ ID NOs: 242-482 or any one of
the sequences of polypeptides encoded by the clone inserts of the deposited clone pool.

In one embodiment, the polypeptide is selectively recognized by an antibody raised against
an antigenic polypeptide, or an antigenic fragment thereof, said antigenic polypeptide comprising
any one of the sequences shown as SEQ ID NOs: 242-482 or any one of the sequences of
polypeptides encoded by the clone inserts of the deposited clone pool. In another embodiment, the
polypeptide comprises any one of the sequences shown as SEQ ID NOs: 242-482 or any one of the
sequences of polypeptides encoded by the clone inserts of the deposited clone pool. In another
embodiment, the polypeptide comprises a signal peptide. In another embodiment, the polypeptide
is a mature protein.

In another aspect, the present invention provides an antibody that specifically binds to any
of the herein-described polypeptides.

In another aspect, the present invention provides a method of determining whether a
GENSET gene is expressed within a mammal, the method comprising the steps of: a) providing a
biological sample from said mammal; b) contacting said biological sample with either of: i) a
polynucleotide that hybridizes under stringent conditions to the polynucleotide of claim 1; or ii) a
polypeptide that specifically binds to the polypeptide of claim 19; and c) detecting the presence or
absence of hybridization between the polynucleotide and an RNA species within the sample, or the
presence or absence of binding of the polypeptide to a protein within the sample; wherein a
detection of the hybridization or of the binding indicates that the GENSET gene is expressed within
the mammal.

In one embodiment, the polynucleotide is a primer, and the hybridization is detected by
detecting the presence of an amplification product comprising the sequence of the primer. In
another embodiment, the polypeptide is an antibody.

In another aspect, the present invention provides a method of determining whether a
mammal has an elevated or reduced level of GENSET gene expression, the method comprising the
steps of: a) providing a biological sample from the mammal; and b) comparing the amount of any
of the herein-described polypeptides, or of an RNA species encoding the polypeptide, within the
biological sample with a level detected in or expected from a control sample; wherein an increased
amount of the polypeptide or the RNA species within the biological sample compared to the level
detected in or expected from the control sample indicates that the mammal has an elevated level of
the GENSET gene expression, and wherein a decreased amount of the polypeptide or the RNA
species within the biological sample compared to the level detected in or expected from the control
sample indicates that the mammal has a reduced level of the GENSET gene expression.

In another aspect, the present invention provides a method of identifying a candidate
modulator of a GENSET polypeptide, the method comprising: a) contacting any of the
herein-described polypeptides with a test compound; and b) determining whether the compound
specifically binds to the polypeptide; wherein a detection that the compound specifically binds to
the polypeptide indicates that the compound is a candidate modulator of the GENSET polypeptide.

Brief description of drawings 

Figure 1 is a map of the expression vector pPT.

Figure 2 is a block diagram of an exemplary computer system.

Figure 3 is a flow diagram illustrating one embodiment of a process 200 for comparing a
new nucleotide or protein sequence with a database of sequences in order to determine the identity
levels between the new sequence and the sequences in the database.

Figure 4 is a flow diagram illustrating one embodiment of a process 250 in a computer for
determining whether two sequences are homologous.

Figure 5 is a flow diagram illustrating one embodiment of an identifier process 300 for 
detecting the presence of a feature in a sequence. 

Brief Description of Tables 

Table I provides the applicant's internal designation number assigned to each sequence
identification number and indicates whether the sequence is a nucleic acid sequence or a
polypeptide sequence, and in which vector the cDNA was cloned.

Table II provides structural features for each cDNA of SEQ ID Nos: 1-241, i.e., the
locations of the full coding sequences, the signal peptides, the mature polypeptides, the polyA
signal and the polyA site.

Table III lists variants for cDNAs of the present invention.

Table IV provides the positions of fragments which are preferably excluded from the 
present invention. 

Tables Va and Vb provide the positions of fragments which are preferably excluded from or
included in the present invention. Table IV, Table Va, and Table Vb provide for the inclusion
and exclusion of polynucleotides independently from each other, in addition to those described
elsewhere in the specification, and are therefore not meant as a limiting description.

Table VI lists known biologically structural and functional domains for the polypeptides of 
the present invention. 

Table VII lists antigenic peaks of predicted antigenic epitopes for polypeptides of the 
present invention.

Table VIII lists the putative chromosomal location of the polynucleotides of the present 
invention. 

Table IX lists Genset's cDNA libraries of tissues and cell types examined that express
the polynucleotides of the present invention.

Table X relates to the bias in spatial distribution of the polynucleotide sequences of the
present invention.

Table XI lists predicted subcellular localization for cDNAs of the present invention. 

Table XII gives the correspondence between the polynucleotides of the US priority
applications, namely the US Provisional Patent Applications Serial Nos 60/169,629 and 60/187,470
(column entitled "Seq Id No in priority applications") and the polynucleotides of the present
application (column entitled "Seq Id No in present application").

Brief description of sequence listing

SEQ ID Nos: 1-31 and 33-143 are the nucleotide sequences of cDNAs encoding a
potentially secreted protein. The locations of the ORFs and sequences encoding signal peptides are
listed in the accompanying Sequence Listing. In addition, the von Heijne score of the signal peptide
computed as described below is listed as the "score" in the accompanying Sequence Listing. The
sequence of the signal-peptide is listed as "seq" in the accompanying Sequence Listing. The in 
the signal peptide sequence indicates the location where proteolytic cleavage of the signal peptide 
occurs to generate a mature protein. When appropriate, the locations of the first and last nucleotides
of the coding sequences and, where applicable, the locations of the first and last nucleotides of the
polyA signals and of the polyA sites are indicated.

SEQ ID Nos: 32 and 144-241 are the nucleotide sequences of cDNAs in which no sequence
encoding a signal peptide has been identified to date. However, it remains possible that subsequent
analysis will identify a sequence encoding a signal peptide in these nucleic acids. The locations of
the ORFs are listed in the accompanying Sequence Listing. When appropriate, the locations of the
first and last nucleotides of the coding sequences and, where applicable, the locations of the first and
last nucleotides of the polyA signals and of the polyA sites are indicated.

SEQ ID Nos: 242-272 and 274-384 are the amino acid sequences of polypeptides which
contain a signal peptide. These polypeptides are encoded by the cDNAs of SEQ ID Nos: 1-31 and
33-143, respectively. The location of the signal peptide is listed in the accompanying Sequence
Listing.

SEQ ID Nos: 273 and 385-482 are the amino acid sequences of polypeptides in which no
signal peptide has been identified to date. However, it remains possible that subsequent analysis
will identify a signal peptide in these polypeptides. These polypeptides are encoded by the nucleic
acids of SEQ ID Nos: 32 and 144-241, respectively.

In accordance with the regulations relating to Sequence Listings, the following codes have
been used in the Sequence Listing to describe nucleotide sequences. The code "r" in the sequences
indicates that the nucleotide may be a guanine or an adenine. The code "y" in the sequences
indicates that the nucleotide may be a thymine or a cytosine. The code "m" in the sequences
indicates that the nucleotide may be an adenine or a cytosine. The code "k" in the sequences
indicates that the nucleotide may be a guanine or a thymine. The code "s" in the sequences indicates
that the nucleotide may be a guanine or a cytosine. The code "w" in the sequences indicates that the
nucleotide may be an adenine or a thymine. In addition, all instances of the symbol "n" in the
nucleic acid sequences mean that the nucleotide can be adenine, guanine, cytosine or thymine.
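
By way of illustration only, the ambiguity codes listed above can be expressed as a lookup table. The following sketch assumes lower-case DNA symbols; the names AMBIGUITY_CODES, possible_bases and is_compatible are hypothetical and do not appear in the Sequence Listing.

```python
# Ambiguity codes as enumerated in the preceding paragraph.
AMBIGUITY_CODES = {
    "r": {"g", "a"},
    "y": {"t", "c"},
    "m": {"a", "c"},
    "k": {"g", "t"},
    "s": {"g", "c"},
    "w": {"a", "t"},
    "n": {"a", "g", "c", "t"},
}


def possible_bases(symbol):
    """Return the set of bases that one Sequence Listing symbol may denote."""
    symbol = symbol.lower()
    if len(symbol) == 1 and symbol in "acgt":
        return {symbol}
    # Raises KeyError for any symbol not covered by the paragraph above.
    return set(AMBIGUITY_CODES[symbol])


def is_compatible(listing_sequence, candidate):
    """True if an unambiguous candidate is consistent with a listing sequence."""
    if len(listing_sequence) != len(candidate):
        return False
    return all(
        base.lower() in possible_bases(symbol)
        for symbol, base in zip(listing_sequence, candidate)
    )
```

Under these assumptions, a listing sequence "acry" is compatible with the candidate "acgt" (r may be g, y may be t) but not with "acct".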

In some instances, the polypeptide sequences in the Sequence Listing contain the symbol
"Xaa." These "Xaa" symbols indicate either (1) a residue which cannot be identified because of
nucleotide sequence ambiguity or (2) a stop codon in the determined sequence where applicants
believe one should not exist (if the sequence were determined more accurately). In some instances,
several possible identities of the unknown amino acids may be suggested by the genetic code.

In the case of secreted proteins, it should be noted that, in accordance with the regulations 
governing Sequence Listings, in the appended Sequence Listing, the encoded protein (i.e. the 
protein containing the signal peptide and the mature protein or part thereof) extends from an amino
acid residue having a negative number through a positively numbered amino acid residue. Thus, the 
first amino acid of the mature protein resulting from cleavage of the signal peptide is designated as 
amino acid number 1, and the first amino acid of the signal peptide is designated with the 
appropriate negative number. However, in the present application, positions on amino acid 
sequences are always given on the full length polypeptide, the first amino acid of the signal peptide
being designated as amino acid number 1. 
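
The relationship between the two numbering schemes described above can be sketched as follows. The sketch assumes that position 0 is not used in the Sequence Listing numbering (so that residue -1 is immediately followed by residue 1); the function names and the signal peptide length used in the example are hypothetical.

```python
def listing_to_full_length(listing_position, signal_peptide_length):
    """Convert a Sequence Listing position (negative within the signal
    peptide, positive within the mature protein) to a position on the
    full-length polypeptide, where the first signal peptide residue is 1."""
    if listing_position == 0:
        raise ValueError("position 0 is assumed not to be used")
    if listing_position < -signal_peptide_length:
        raise ValueError("position lies before the start of the signal peptide")
    if listing_position < 0:
        return listing_position + signal_peptide_length + 1
    return listing_position + signal_peptide_length


def full_length_to_listing(full_length_position, signal_peptide_length):
    """Inverse conversion, from full-length numbering to listing numbering."""
    if full_length_position < 1:
        raise ValueError("full-length positions start at 1")
    if full_length_position <= signal_peptide_length:
        return full_length_position - signal_peptide_length - 1
    return full_length_position - signal_peptide_length
```

For a hypothetical signal peptide of 20 residues, listing positions -20, -1 and 1 correspond to full-length positions 1, 20 and 21, respectively.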

Detailed description 

Definitions 

Before describing the invention in greater detail, the following definitions are set forth to 
illustrate and define the meaning and scope of the terms used to describe the invention herein.

The term "GENSET gene", when used herein, encompasses genomic, mRNA and cDNA
sequences encoding the GENSET protein, including the 5' and 3' untranslated regions of said
sequences.

As used herein, a "secreted" protein is one which, when expressed in a suitable host cell, is
transported across or through a membrane, including transport as a result of signal peptides in its
amino acid sequence. "Secreted" proteins include without limitation proteins secreted wholly (e.g.
soluble proteins), or partially (e.g. receptors) from the cell in which they are expressed. "Secreted"
proteins also include without limitation proteins which are transported across the membrane of the
endoplasmic reticulum. As used herein, a "mature protein" is the polypeptide fragment generated
after the cleavage of the signal peptide.

The term " full coding sequence " or open reading frame (ORF) of a GENSET gene, when 
used herein, refers to the complete coding sequence of said gene. In the case of a secreted protein, 
the full coding sequence comprises the coding sequence for the signal peptide and the coding 
sequence for the mature polypeptide. Accordingly, the term "full-length polypeptide" refers to the
complete polypeptide encoded by said GENSET gene and in the case of a secreted protein it
comprises both the signal peptide and the mature polypeptide. The positions of the full length 
polypeptides and, in the case of secreted proteins, of signal peptides and mature polypeptides are 
given in the appended sequence listing. 

The term "GENSET biological activity" is intended for polypeptides exhibiting an activity
similar, but not necessarily identical, to an activity of the GENSET polypeptide of the invention.
The GENSET biological activity of a given polypeptide may be assessed using a suitable biological
assay well known to those skilled in the art such as the one(s) described herein. In contrast, the
term "biological activity" refers to any activity that a polypeptide of the invention may have.

The term " corresponding mRNA " refers to the mRNA which was the template for the 
cDNA synthesis which produced a cDNA of the present invention.

The term " corresponding genomic DNA " refers to the genomic DNA which encodes mRNA 
which includes the sequence of one of the strands of the cDNA in which thymidine residues in the 
sequence of the cDNA are replaced by uracil residues in the mRNA. 

The term "deposited clone pool" is used herein to refer to the pool of clones entitled
GENSET.071PRF deposited with the ATCC under the accession number PTA-1218 on January 21, 2000.

The term " heterologous ", when used herein, is intended to designate any polynucleotide or 
polypeptide other than the GENSET polynucleotide or polypeptide respectively. 

The term "isolated" requires that the material be removed from its original environment
(e.g., the natural environment if it is naturally occurring). For example, a naturally-occurring
polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide
or DNA or polypeptide, separated from some or all of the coexisting materials in the natural system, 
is isolated. Such polynucleotide could be part of a vector and/or such polynucleotide or polypeptide 
could be part of a composition, and still be isolated in that the vector or composition is not part of 
its natural environment. For example, a naturally-occurring polynucleotide present in a living
animal is not isolated, but the same polynucleotide, separated from some or all of the coexisting
materials in the natural system, is isolated. Specifically excluded from the definition of "isolated" 
are: naturally-occurring chromosomes (such as chromosome spreads), artificial chromosome 
libraries, genomic libraries, and cDNA libraries that exist either as an in vitro nucleic acid 
preparation or as a transfected/transformed host cell preparation, wherein the host cells are either an
in vitro heterogeneous preparation or plated as a heterogeneous population of single colonies. Also
specifically excluded are the above libraries wherein a specified polynucleotide makes up less than 
5% of the number of nucleic acid inserts in the vector molecules. Further specifically excluded are 
whole cell genomic DNA or whole cell RNA preparations (including said whole cell preparations 
which are mechanically sheared or enzymatically digested). Further specifically excluded are the
above whole cell preparations as either an in vitro preparation or as a heterogeneous mixture
separated by electrophoresis (including blot transfers of the same) wherein the polynucleotide of the
invention has not further been separated from the heterologous polynucleotides in the 
electrophoresis medium (e.g., further separating by excising a single band from a heterogeneous 
band population in an agarose gel or nylon blot). 

The term "purified" does not require absolute purity; rather, it is intended as a relative
definition. Purification of starting material or natural material to at least one order of magnitude,
preferably two or three orders, and more preferably four or five orders of magnitude is expressly
contemplated. As an example, purification from 0.1% concentration to 10% concentration is two
orders of magnitude. To illustrate, individual cDNA clones isolated from a cDNA library have been
conventionally purified to electrophoretic homogeneity. The sequences obtained from these clones
could not be obtained directly either from the library or from total human DNA. The cDNA clones
are not naturally occurring as such, but rather are obtained via manipulation of a partially purified
naturally occurring substance (messenger RNA). The conversion of mRNA into a cDNA library
involves the creation of a synthetic substance (cDNA) and pure individual cDNA clones can be
isolated from the synthetic library by clonal selection. Thus, creating a cDNA library from
messenger RNA and subsequently isolating individual clones from that library results in an
approximately 10^4 to 10^6 fold purification of the native message.
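
The arithmetic underlying the "orders of magnitude" language above can be sketched as follows. This is a minimal illustration, assuming that purification is expressed as the ratio of final to starting concentration; the function names are hypothetical.

```python
import math


def fold_purification(start_percent, end_percent):
    """Ratio of the final concentration to the starting concentration."""
    if start_percent <= 0 or end_percent <= 0:
        raise ValueError("concentrations must be positive")
    return end_percent / start_percent


def orders_of_magnitude(start_percent, end_percent):
    """Base-10 logarithm of the fold purification."""
    return math.log10(fold_purification(start_percent, end_percent))
```

With these definitions, purification from 0.1% to 10% is a 100-fold purification, i.e. two orders of magnitude, and a 10^4 to 10^6 fold purification corresponds to four to six orders of magnitude.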

The term "purified" is further used herein to describe a polypeptide or polynucleotide of the 
invention which has been separated from other compounds including, but not limited to, 
polypeptides or polynucleotides, carbohydrates, lipids, etc. The term "purified" may be used to 
specify the separation of monomeric polypeptides of the invention from oligomeric forms such as 

homo- or hetero- dimers, trimers, etc. The term "purified" may also be used to specify the
separation of covalently closed polynucleotides from linear polynucleotides. A polynucleotide is
substantially pure when at least about 50%, preferably 60 to 75% of a sample exhibits a single
polynucleotide sequence and conformation (linear versus covalently closed). A substantially pure
polypeptide or polynucleotide typically comprises about 50%, preferably 60 to 90% weight/weight
of a polypeptide or polynucleotide sample, respectively, more usually about 95%, and preferably is
over about 99% pure. Polypeptide and polynucleotide purity, or homogeneity, is indicated by a 
number of means well known in the art, such as agarose or polyacrylamide gel electrophoresis of a 
sample, followed by visualizing a single band upon staining the gel. For certain purposes higher 
resolution can be provided by using HPLC or other means well known in the art. As an alternative
embodiment, purification of the polypeptides and polynucleotides of the present invention may be
expressed as "at least" a percent purity relative to heterologous polypeptides and polynucleotides
(DNA, RNA or both). As a preferred embodiment, the polypeptides and polynucleotides of the present
invention are at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%,
99%, or 100% pure relative to heterologous polypeptides and polynucleotides, respectively. As a
further preferred embodiment the polypeptides and polynucleotides have a purity ranging from any
number, to the thousandth position, between 90% and 100% (e.g., a polypeptide or polynucleotide at
least 99.995% pure) relative to either heterologous polypeptides or polynucleotides, respectively, or as
a weight/weight ratio relative to all compounds and molecules other than those existing in the
carrier. Each number representing a percent purity, to the thousandth position, may be claimed as
individual species of purity.

As used interchangeably herein, the terms "nucleic acid molecule(s)", "oligonucleotide(s)",
and "polynucleotide(s)" include RNA or DNA (either single or double stranded, coding,
complementary or antisense), or RNA/DNA hybrid sequences of more than one nucleotide in either
single chain or duplex form (although each of the above species may be particularly specified). The
term "nucleotide" is used herein as an adjective to describe molecules comprising RNA, DNA, or
RNA/DNA hybrid sequences of any length in single-stranded or duplex form. More precisely, the
expression "nucleotide sequence" encompasses the nucleic material itself and is thus not restricted
to the sequence information (i.e. the succession of letters chosen among the four base letters) that 
biochemically characterizes a specific DNA or RNA molecule. The term "nucleotide" is also used 
herein as a noun to refer to individual nucleotides or varieties of nucleotides, meaning a molecule, 
or individual unit in a larger nucleic acid molecule, comprising a purine or pyrimidine, a ribose or
deoxyribose sugar moiety, and a phosphate group, or phosphodiester linkage in the case of
nucleotides within an oligonucleotide or polynucleotide. The term "nucleotide" is also used herein
to encompass "modified nucleotides" which comprise at least one modification such as (a) an
alternative linking group, (b) an analogous form of purine, (c) an analogous form of pyrimidine, or
(d) an analogous sugar. For examples of analogous linking groups, purines, pyrimidines, and sugars,
see for example PCT publication No. WO 95/04064, which disclosure is hereby incorporated by
reference in its entirety. Preferred modifications of the present invention include, but are not limited 
to, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine,
N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine,
2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine,
5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine,
uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil,
2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid,
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, and 2,6-diaminopurine. The
polynucleotide sequences of the invention may be prepared by any known method, including
synthetic, recombinant, ex vivo generation, or a combination thereof, as well as utilizing any
purification methods known in the art. Methylenemethylimino linked oligonucleosides as well as
mixed backbone compounds may be prepared as described in U.S. Pat. Nos. 5,378,825;
5,386,023; 5,489,677; 5,602,240; and 5,610,289, which disclosures are hereby incorporated by 
reference in their entireties. Formacetal and thioformacetal linked oligonucleosides may be prepared 
as described in U.S. Pat. Nos. 5,264,562 and 5,264,564, which disclosures are hereby incorporated 
by reference in their entireties. Ethylene oxide linked oligonucleosides may be prepared as
described in U.S. Pat. No. 5,223,618, which disclosure is hereby incorporated by reference in its
entirety. Phosphinate oligonucleotides may be prepared as described in U.S. Pat. No. 5,508,270, 
which disclosure is hereby incorporated by reference in its entirety. Alkyl phosphonate
oligonucleotides may be prepared as described in U.S. Pat. No. 4,469,863, which disclosure is
hereby incorporated by reference in its entirety. 3'-Deoxy-3'-methylene phosphonate
oligonucleotides may be prepared as described in U.S. Pat. Nos. 5,610,289 or 5,625,050 which
disclosures are hereby incorporated by reference in their entireties. Phosphoramidite
oligonucleotides may be prepared as described in U.S. Pat. No. 5,256,775 or U.S. Pat. No.
5,366,878 which disclosures are hereby incorporated by reference in their entireties.
Alkylphosphonothioate oligonucleotides may be prepared as described in published PCT
applications WO 94/17093 and WO 94/02499 which disclosures are hereby incorporated by
reference in their entireties. 3'-Deoxy-3'-amino phosphoramidate oligonucleotides may be prepared
as described in U.S. Pat. No. 5,476,925, which disclosure is hereby incorporated by reference in its
entirety. Phosphotriester oligonucleotides may be prepared as described in U.S. Pat. No. 5,023,243,
which disclosure is hereby incorporated by reference in its entirety. Borano phosphate
oligonucleotides may be prepared as described in U.S. Pat. Nos. 5,130,302 and 5,177,198 which
disclosures are hereby incorporated by reference in their entireties.

The term "upstream" is used herein to refer to a location which is toward the 5' end of the
polynucleotide from a specific reference point.

The terms "base paired" and "Watson & Crick base paired" are used interchangeably herein
to refer to nucleotides which can be hydrogen bonded to one another by virtue of their sequence
identities in a manner like that found in double-helical DNA with thymine or uracil residues linked
to adenine residues by two hydrogen bonds and cytosine and guanine residues linked by three
hydrogen bonds (See Stryer, 1995, which disclosure is hereby incorporated by reference in its
entirety).

The terms "complementary" or "complement thereof" are used herein to refer to the
sequences of polynucleotides which are capable of forming Watson & Crick base pairing with
another specified polynucleotide throughout the entirety of the complementary region. For the
purpose of the present invention, a first polynucleotide is deemed to be complementary to a second
polynucleotide when each base in the first polynucleotide is paired with its complementary base.
Complementary bases are, generally, A and T (or A and U), or C and G. "Complement" is used
herein as a synonym for "complementary polynucleotide", "complementary nucleic acid" and
"complementary nucleotide sequence". These terms are applied to pairs of polynucleotides based
solely upon their sequences and not any particular set of conditions under which the two
polynucleotides would actually bind. Unless otherwise stated, all complementary polynucleotides
are fully complementary on the whole length of the considered polynucleotide.
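
The sequence-only test for full complementarity described above can be sketched as follows. This is a minimal illustration, not part of the original disclosure: the function name is hypothetical, and it assumes both strands are written 5' to 3' and pair in antiparallel orientation, as in double-helical DNA.

```python
# Watson & Crick partners named in the text: A-T (or A-U) and C-G.
# Illustrative helper only; the name and signature are assumptions.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def is_fully_complementary(first: str, second: str) -> bool:
    """Return True if every base of `first` pairs with `second` read antiparallel.

    Both sequences are assumed to be written 5' to 3'. Full complementarity
    over the whole length requires equal lengths.
    """
    first, second = first.upper(), second.upper()
    if len(first) != len(second):
        return False
    return all((a, b) in PAIRS for a, b in zip(first, reversed(second)))
```

Consistent with the definition, the check depends only on the two sequences and says nothing about the conditions under which the strands would actually hybridize.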

The terms "polypeptide" and "protein", used interchangeably herein, refer to a polymer of
amino acids without regard to the length of the polymer; thus, peptides, oligopeptides, and proteins
are included within the definition of polypeptide. This term also does not specify or exclude
chemical or post-expression modifications of the polypeptides of the invention, although chemical
or post-expression modifications of these polypeptides may be included or excluded as specific
embodiments. Therefore, for example, modifications to polypeptides that include the covalent
attachment of glycosyl groups, acetyl groups, phosphate groups, lipid groups and the like are
expressly encompassed by the term polypeptide. Further, polypeptides with these modifications
may be specified as individual species to be included or excluded from the present invention. The
natural or other chemical modifications, such as those listed in the examples above, can occur anywhere
in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or
carboxyl termini. It will be appreciated that the same type of modification may be present in the
same or varying degrees at several sites in a given polypeptide. Also, a given polypeptide may
contain many types of modifications. Polypeptides may be branched, for example, as a result of
ubiquitination, and they may be cyclic, with or without branching. Modifications include
acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent
attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent
attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking,
cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation
of cysteine, formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI
anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation,
proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation,
transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination
(see, for instance, Creighton (1993); Seifter et al. (1990); Rattan et al. (1992)). Also included
within the definition are polypeptides which contain one or more analogs of an amino acid
(including, for example, non-naturally occurring amino acids, amino acids which only occur
naturally in an unrelated biological system, modified amino acids from mammalian systems, etc.),
polypeptides with substituted linkages, as well as other modifications known in the art, both
naturally occurring and non-naturally occurring.

As used herein, the terms "recombinant polynucleotide" and "polynucleotide construct" are
used interchangeably to refer to linear or circular, purified or isolated polynucleotides that have
been artificially designed and which comprise at least two nucleotide sequences that are not found
as contiguous nucleotide sequences in their initial natural environment. In particular, these terms
mean that the polynucleotide or cDNA is adjacent to "backbone" nucleic acid to which it is not
adjacent in its natural environment. Additionally, to be "enriched" the cDNAs will represent 5% or
more of the number of nucleic acid inserts in a population of nucleic acid backbone molecules.
Backbone molecules according to the present invention include nucleic acids such as expression
vectors, self-replicating nucleic acids, viruses, integrating nucleic acids, and other vectors or nucleic
acids used to maintain or manipulate a nucleic acid insert of interest. Preferably, the enriched
cDNAs represent 15% or more of the number of nucleic acid inserts in the population of
recombinant backbone molecules. More preferably, the enriched cDNAs represent 50% or more of
the number of nucleic acid inserts in the population of recombinant backbone molecules. In a
highly preferred embodiment, the enriched cDNAs represent 90% or more (including any number
between 90 and 100%, to the thousandth position, e.g., 99.5%) of the number of nucleic acid
inserts in the population of recombinant backbone molecules.

The term "recombinant polypeptide" is used herein to refer to polypeptides that have been
artificially designed and which comprise at least two polypeptide sequences that are not found as
contiguous polypeptide sequences in their initial natural environment, or to refer to polypeptides
which have been expressed from a recombinant polynucleotide.

As used herein, the term "operably linked" refers to a linkage of polynucleotide elements in
a functional relationship. A sequence which is "operably linked" to a regulatory sequence such as a
promoter means that said regulatory element is in the correct location and orientation in relation to
the nucleic acid to control RNA polymerase initiation and expression of the nucleic acid of interest.
For instance, a promoter or enhancer is operably linked to a coding sequence if it affects the
transcription of the coding sequence.

As used herein, the term "non-human animal" refers to any non-human animal, including
insects, birds, rodents and more usually mammals. Preferred non-human animals include: primates;
farm animals such as swine, goats, sheep, donkeys, cattle, horses, chickens, rabbits; and rodents,
preferably rats or mice. As used herein, the term "animal" is used to refer to any species in the
animal kingdom, preferably vertebrates, including birds and fish, and more preferably a mammal.
Both the terms "animal" and "mammal" expressly embrace human subjects unless preceded with
the term "non-human".

The term "domain" refers to an amino acid fragment with specific biological properties.
This term encompasses all known structural and linear biological motifs. Examples of such motifs
include but are not limited to leucine zippers, helix-turn-helix motifs, glycosylation sites,
ubiquitination sites, alpha helices, and beta sheets, signal peptides which direct the secretion of
proteins, sites for post-translational modification, enzymatic active sites, substrate binding sites, and
enzymatic cleavage sites.

Although they have distinct meanings, the terms "comprising", "consisting of" and
"consisting essentially of" may be interchanged for one another throughout the instant application.
The term "having" has the same meaning as "comprising" and may be replaced with either the term
"consisting of" or "consisting essentially of".

An "amplification product" refers to a product of any amplification reaction, e.g. PCR,
RT-PCR, LCR, etc.

A "modulator" of a protein or other compound refers to any agent that has a functional
effect on the protein, including physical binding to the protein, alterations of the quantity or quality
of expression of the protein, altering any measurable or detectable activity, property, or behavior of
the protein, or in any way interacts with the protein or compound.

A "test compound" can be any molecule that is evaluated for its ability to modulate a
protein or other compound.

Unless otherwise specified in the application, nucleotides and amino acids of
polynucleotides and polypeptides respectively of the present invention are contiguous and not
interrupted by heterologous sequences.

Identity Between Nucleic Acids Or Polypeptides 

The terms "percentage of sequence identity" and "percentage homology" are used
interchangeably herein to refer to comparisons among polynucleotides and polypeptides, and are
determined by comparing two optimally aligned sequences over a comparison window, wherein the
portion of the polynucleotide or polypeptide sequence in the comparison window may comprise
additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise
additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by
determining the number of positions at which the identical nucleic acid base or amino acid residue
occurs in both sequences to yield the number of matched positions, dividing the number of matched
positions by the total number of positions in the window of comparison and multiplying the result
by 100 to yield the percentage of sequence identity. Homology is evaluated using any of the variety
of sequence comparison algorithms and programs known in the art. Such algorithms and programs
include, but are by no means limited to, TBLASTN, BLASTP, FASTA, TFASTA, CLUSTALW,
FASTDB (Pearson and Lipman, 1988; Altschul et al., 1990; Thompson et al., 1994; Higgins et
al., 1996; Altschul et al., 1990; Altschul et al., 1993; Brutlag et al., 1990), the disclosures of which
are incorporated by reference in their entireties.
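
The arithmetic in the definition above can be sketched as follows. This is an illustrative example only: it assumes the two sequences have already been optimally aligned by one of the named programs and are supplied as equal-length strings with "-" marking gaps. The function name and the gap symbol are assumptions, not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Matched positions / total positions in the comparison window * 100.

    The inputs are two rows of an existing alignment (equal length);
    a gap never counts as a matched position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matched / len(aligned_a)
```

For the window `ACGT-A` aligned against `ACCTTA`, four of six positions match, so the function returns roughly 66.7.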

In a particularly preferred embodiment, protein and nucleic acid sequence homologies are
evaluated using the Basic Local Alignment Search Tool ("BLAST") which is well known in the art
(see, e.g., Karlin and Altschul, 1990; Altschul et al., 1990, 1993, 1997), the disclosures of which
are incorporated by reference in their entireties. In particular, five specific BLAST programs are
used to perform the following tasks:

(1) BLASTP and BLAST3 compare an amino acid query sequence against a protein
sequence database;

(2) BLASTN compares a nucleotide query sequence against a nucleotide sequence
database;

(3) BLASTX compares the six-frame conceptual translation products of a query nucleotide
sequence (both strands) against a protein sequence database;

(4) TBLASTN compares a query protein sequence against a nucleotide sequence database
translated in all six reading frames (both strands); and

(5) TBLASTX compares the six-frame translations of a nucleotide query sequence against
the six-frame translations of a nucleotide sequence database.

The BLAST programs identify homologous sequences by identifying similar segments,
which are referred to herein as "high-scoring segment pairs," between a query amino or nucleic acid
sequence and a test sequence which is preferably obtained from a protein or nucleic acid sequence
database. High-scoring segment pairs are preferably identified (i.e., aligned) by means of a scoring
matrix, many of which are known in the art. Preferably, the scoring matrix used is the BLOSUM62
matrix (Gonnet et al., 1992; Henikoff and Henikoff, 1993), the disclosures of which are
incorporated by reference in their entireties. Less preferably, the PAM or PAM250 matrices may
also be used (see, e.g., Schwartz and Dayhoff, eds., 1978), the disclosure of which is incorporated
by reference in its entirety. The BLAST programs evaluate the statistical significance of all
high-scoring segment pairs identified, and preferably select those segments which satisfy a
user-specified threshold of significance, such as a user-specified percent homology. Preferably, the
statistical significance of a high-scoring segment pair is evaluated using the statistical significance
formula of Karlin (see, e.g., Karlin and Altschul, 1990), the disclosure of which is incorporated by
reference in its entirety. The BLAST programs may be used with the default parameters or with
modified parameters provided by the user.
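
The division of labor among the BLAST programs listed above can be summarized as a small lookup. The sketch below is illustrative only: the function and its arguments are hypothetical and do not call any BLAST software, and BLAST3 is left out because the text gives it the same query and database types as BLASTP.

```python
# (query type, database type) -> program, as enumerated in items (1) to (4).
BLAST_PROGRAMS = {
    ("protein", "protein"): "BLASTP",
    ("nucleotide", "nucleotide"): "BLASTN",
    ("nucleotide", "protein"): "BLASTX",
    ("protein", "nucleotide"): "TBLASTN",
}


def choose_blast_program(query_type: str, database_type: str,
                         translate_both: bool = False) -> str:
    """Pick the program matching the query and database sequence types.

    `translate_both=True` selects TBLASTX (item 5), which compares six-frame
    translations of a nucleotide query and a nucleotide database.
    """
    if translate_both:
        if (query_type, database_type) != ("nucleotide", "nucleotide"):
            raise ValueError("TBLASTX needs a nucleotide query and database")
        return "TBLASTX"
    return BLAST_PROGRAMS[(query_type, database_type)]
```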

Another preferred method for determining the best overall match between a query
nucleotide sequence (a sequence of the present invention) and a subject sequence, also referred to as
a global sequence alignment, can be determined using the FASTDB computer program based on the
algorithm of Brutlag et al. (1990), the disclosure of which is incorporated by reference in its
entirety. In a sequence alignment the query and subject sequences are both DNA sequences. An
RNA sequence can be compared by first converting U's to T's. The result of said global sequence
alignment is in percent identity. Preferred parameters used in a FASTDB alignment of DNA
sequences to calculate percent identity are: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1,
Joining Penalty=30, Randomization Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size
Penalty=0.05, Window Size=500 or the length of the subject nucleotide sequence, whichever is
shorter. If the subject sequence is shorter than the query sequence because of 5' or 3' deletions, not
because of internal deletions, a manual correction must be made to the results. This is because the
FASTDB program does not account for 5' and 3' truncations of the subject sequence when
calculating percent identity. For subject sequences truncated at the 5' or 3' ends, relative to the query
sequence, the percent identity is corrected by calculating the number of bases of the query sequence
that are 5' and 3' of the subject sequence, which are not matched/aligned, as a percent of the total
bases of the query sequence. Whether a nucleotide is matched/aligned is determined by results of
the FASTDB sequence alignment. This percentage is then subtracted from the percent identity,
calculated by the above FASTDB program using the specified parameters, to arrive at a final
percent identity score. This corrected score is what is used for the purposes of the present invention.
Only nucleotides outside the 5' and 3' nucleotides of the subject sequence, as displayed by the
FASTDB alignment, which are not matched/aligned with the query sequence, are calculated for the
purposes of manually adjusting the percent identity score. For example, a 90 nucleotide subject
sequence is aligned to a 100 nucleotide query sequence to determine percent identity. The deletions
occur at the 5' end of the subject sequence and therefore, the FASTDB alignment does not show a
match/alignment of the first 10 nucleotides at the 5' end. The 10 unpaired nucleotides represent 10%
of the sequence (number of nucleotides at the 5' and 3' ends not matched/total number of
nucleotides in the query sequence) so 10% is subtracted from the percent identity score calculated
by the FASTDB program. If the remaining 90 nucleotides were perfectly matched the final percent
identity would be 90%. In another example, a 90 nucleotide subject sequence is compared with a
100 nucleotide query sequence. This time the deletions are internal deletions so that there are no
nucleotides on the 5' or 3' of the subject sequence which are not matched/aligned with the query. In
this case the percent identity calculated by FASTDB is not manually corrected. Once again, only
nucleotides 5' and 3' of the subject sequence which are not matched/aligned with the query sequence
are manually corrected. No other manual corrections are made for the purposes of the present
invention.
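
The manual correction described above reduces to a single subtraction. The following sketch is illustrative only: it does not run FASTDB, and the function and parameter names are hypothetical. It takes the percent identity reported by the alignment program and the number of query bases lying 5' and 3' of the subject sequence that are not matched/aligned.

```python
def corrected_percent_identity(fastdb_percent: float, query_length: int,
                               unmatched_5prime: int,
                               unmatched_3prime: int) -> float:
    """Subtract the terminal-overhang percentage from the reported identity.

    Only query positions outside the ends of the subject sequence count;
    internal deletions are left to the alignment program's own score.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    overhang = unmatched_5prime + unmatched_3prime
    if overhang < 0 or overhang > query_length:
        raise ValueError("unmatched terminal bases must fit within the query")
    penalty = 100.0 * overhang / query_length
    return fastdb_percent - penalty
```

With the worked example from the text (100-nucleotide query, 10 unmatched bases at the 5' end, remaining 90 bases perfectly matched), `corrected_percent_identity(100.0, 100, 10, 0)` gives 90.0. With internal deletions only, both overhang counts are zero and the reported score is returned unchanged.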

Another preferred method for determining the best overall match between a query amino
acid sequence (a sequence of the present invention) and a subject sequence, also referred to as a
global sequence alignment, can be determined using the FASTDB computer program based on the
algorithm of Brutlag et al. (1990). In a sequence alignment the query and subject sequences are both
amino acid sequences. The result of said global sequence alignment is in percent identity. Preferred
parameters used in a FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch
Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1, Window
Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of
the subject amino acid sequence, whichever is shorter. If the subject sequence is shorter than the
query sequence due to N- or C-terminal deletions, not because of internal deletions, the results, in
percent identity, must be manually corrected. This is because the FASTDB program does not
account for N- and C-terminal truncations of the subject sequence when calculating global percent
identity. For subject sequences truncated at the N- and C-termini, relative to the query sequence,
the percent identity is corrected by calculating the number of residues of the query sequence that are
N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding
subject residue, as a percent of the total residues of the query sequence. Whether a residue is
matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is
then subtracted from the percent identity, calculated by the above FASTDB program using the
specified parameters, to arrive at a final percent identity score. This final percent identity score is
what is used for the purposes of the present invention. Only residues to the N- and C-termini of the
subject sequence, which are not matched/aligned with the query sequence, are considered for the
purposes of manually adjusting the percent identity score. That is, only query amino acid residues
outside the farthest N- and C-terminal residues of the subject sequence are considered. For example,
a 90 amino acid residue subject sequence is aligned with a 100-residue query sequence to determine
percent identity. The deletion occurs at the N-terminus of the subject sequence and therefore, the
FASTDB alignment does not match/align the first 10 residues at the N-terminus. The 10 unpaired
residues represent 10% of the sequence (number of residues at the N- and C-termini not
matched/total number of residues in the query sequence) so 10% is subtracted from the percent
identity score calculated by the FASTDB program. If the remaining 90 residues were perfectly
matched the final percent identity would be 90%. In another example, a 90-residue subject sequence
is compared with a 100-residue query sequence. This time the deletions are internal so there are no
residues at the N- or C-termini of the subject sequence which are not matched/aligned with the
query. In this case the percent identity calculated by FASTDB is not manually corrected. Once again,
only residue positions outside the N- and C-terminal ends of the subject sequence, as displayed in
the FASTDB alignment, which are not matched/aligned with the query sequence are manually
corrected. No other manual corrections are made for the purposes of the present invention.

The term "percentage of sequence similarity" refers to comparisons between polypeptide
sequences and is determined by comparing two optimally aligned sequences over a comparison
window, wherein the portion of the polypeptide sequence in the comparison window may comprise
additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise
additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by
determining the number of positions at which an identical or equivalent amino acid residue occurs
in both sequences to yield the number of matched positions, dividing the number of matched
positions by the total number of positions in the window of comparison and multiplying the result
by 100 to yield the percentage of sequence similarity. Similarity is evaluated using any of the
variety of sequence comparison algorithms and programs known in the art, including those
described above in this section. Equivalent amino acid residues are defined herein in the "Mutated
polypeptides" section.
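
Percentage of sequence similarity differs from percentage identity only in that equivalent residues also count as matched. The sketch below illustrates the arithmetic under a loud assumption: the equivalence groups in `EXAMPLE_GROUPS` are hypothetical placeholders chosen for illustration, not the groups defined in the "Mutated polypeptides" section, which is not reproduced here. The inputs are assumed to be two rows of an existing alignment with "-" as the gap symbol.

```python
# HYPOTHETICAL equivalence groups, for illustration only. Substitute the
# groups actually defined in the "Mutated polypeptides" section.
EXAMPLE_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ"]
GROUP_OF = {res: i for i, group in enumerate(EXAMPLE_GROUPS) for res in group}


def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Positions with identical or equivalent residues / window length * 100."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("need two non-empty aligned sequences of equal length")

    def matched(x: str, y: str) -> bool:
        if gap in (x, y):
            return False  # a gap is never a matched position
        if x == y:
            return True   # identical residue
        group = GROUP_OF.get(x)
        return group is not None and group == GROUP_OF.get(y)

    hits = sum(1 for x, y in zip(aligned_a, aligned_b) if matched(x, y))
    return 100.0 * hits / len(aligned_a)
```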

Polynucleotides of the invention 

The present invention concerns GENSET genomic and cDNA sequences. The present
invention encompasses GENSET genes, polynucleotides comprising GENSET genomic and cDNA
sequences, as well as fragments and variants thereof. These polynucleotides may be purified,
isolated, or recombinant.

Also encompassed by the present invention are allelic variants, orthologs, splice variants,
and/or species homologues of the GENSET genes. Procedures known in the art can be used to
obtain full-length genes and cDNAs, allelic variants, splice variants, full-length coding portions,
orthologs, and/or species homologues of genes and cDNAs corresponding to a nucleotide sequence
selected from the group consisting of sequences of SEQ ID Nos: 1-241 and sequences of clone
inserts of the deposited clone pool, using information from the sequences disclosed herein or the
clone pool deposited with the ATCC. For example, allelic variants, orthologs and/or species
homologues may be isolated and identified by making suitable probes or primers from the
sequences provided herein and screening a suitable nucleic acid source for allelic variants and/or the
desired homologue using any technique known to those skilled in the art including those described
in the section entitled "To find similar sequences".

In a specific embodiment, the polynucleotides of the invention are at least 15, 30, 50, 100,
125, 500, or 1000 continuous nucleotides. In another embodiment, the polynucleotides are less than
or equal to 300kb, 200kb, 100kb, 50kb, 10kb, 7.5kb, 5kb, 2.5kb, 2kb, 1.5kb, or 1kb in length. In a
further embodiment, polynucleotides of the invention comprise a portion of the coding sequences,
as disclosed herein, but do not comprise all or a portion of any intron. In another embodiment, the
polynucleotides comprising coding sequences do not contain coding sequences of a genomic
flanking gene (i.e., 5' or 3' to the gene of interest in the genome). In other embodiments, the
polynucleotides of the invention do not contain the coding sequence of more than 1000, 500, 250,
100, 75, 50, 25, 20, 15, 10, 5, 4, 3, 2, or 1 naturally occurring genomic flanking gene(s).

Deposited clone pool of the invention

Expression of GENSET genes has been shown to lead to the production of at least one
mRNA species per GENSET gene, which cDNA sequence is set forth in the appended sequence
listing as SEQ ID Nos: 1-241. The cDNAs (SEQ ID Nos: 1-241) corresponding to these GENSET
mRNA species were cloned in the vector pBluescript II SK(-) (Stratagene) or one of its derivatives,
called pPT (see figure 1). Cells containing the cloned cDNAs of the present invention are
maintained in permanent deposit by the inventors at Genset, S.A., 24 Rue Royale, 75008 Paris,
France. Table I provides the applicant's internal designation number (column entitled "Internal
designation") assigned to each sequence identification number of SEQ ID Nos: 1-482 (column
entitled "Seq Id No") and indicates whether the sequence is a nucleic acid sequence or a
polypeptide sequence (column entitled "Type"), and in which vector the cDNA was cloned (column
entitled "Vector").

Each cDNA can be removed from the Bluescript vector in which it was inserted by
performing a NotI/PstI double digestion to produce the appropriate fragment for each clone,
provided the cDNA sequence of interest does not contain these restriction sites within its sequence.
The preferable sites for cDNA removal for those clones inserted into pPT are MunI and HindIII, the
sites used for cloning, provided the cDNA sequence of interest does not contain these restriction
sites within its sequence. Alternatively, other restriction enzymes of the multicloning site of the
vector may be used to recover the desired insert as indicated by the manufacturer or in figure 1.

Pools of cells containing the cDNAs of the invention, from which the cells containing a
particular polynucleotide are obtainable, were also deposited with the American Type Culture
Collection (ATCC), 10801 University Boulevard, Manassas, VA 20110-2209, United States. Each

cDNA clone has been transfected into separate bacterial cells (E. coli) for these composite deposits.
In particular, cells containing the sequences of SEQ ID Nos: 1-241 were deposited on January 21,
2000 in the pool having ATCC Accession No. PTA-1218 and designated GENSET.071PRF.

Bacterial cells containing a particular clone can be obtained from the composite deposit as
follows:

An oligonucleotide probe or probes should be designed to the sequence that is known for
that particular clone. This sequence can be derived from the sequences provided herein, or from a
combination of those sequences. The design of the oligonucleotide probe should preferably follow
these parameters:

(a) It should be designed to an area of the sequence which has the fewest ambiguous bases
("N's"), if any;

(b) Preferably, the probe is designed to have a Tm of approximately 80 degrees Celsius
(assuming 2 degrees for each A or T and 4 degrees for each G or C). However, probes having
melting temperatures between 40 degrees Celsius and 80 degrees Celsius may also be used provided
that specificity is not lost.
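
The Tm estimate in parameter (b) is simple enough to compute directly. The sketch below is an illustration with hypothetical function names: it applies the stated rule of 2 degrees per A or T and 4 degrees per G or C, and rejects probes containing ambiguous bases, in line with parameter (a).

```python
def estimate_tm(probe: str) -> int:
    """Estimated Tm in degrees Celsius: 2 per A/T plus 4 per G/C."""
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    if at + gc != len(probe):
        raise ValueError("probe contains ambiguous or non-DNA bases")
    return 2 * at + 4 * gc


def tm_in_usable_range(tm: int, low: int = 40, high: int = 80) -> bool:
    """True if the Tm lies in the 40 to 80 degree range given in the text."""
    return low <= tm <= high
```

For instance, the 17-mer `GTAAAACGACGGCCAGT` has 8 A/T and 9 G/C bases, giving an estimate of 52 degrees Celsius, which falls inside the usable range but below the preferred value of about 80.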

The oligonucleotide should preferably be labeled with gamma-[32P]ATP (specific activity
6000 Ci/mmole) and T4 polynucleotide kinase using commonly employed techniques for labeling
oligonucleotides. Other labeling techniques can also be used. Unincorporated label should
preferably be removed by gel filtration chromatography or other established methods. The amount
of radioactivity incorporated into the probe should be quantified by measurement in a scintillation
counter. Preferably, specific activity of the resulting probe should be approximately 4x10^6
dpm/pmole.

The bacterial culture containing the pool of full-length clones should preferably be thawed
and 100 µl of the stock used to inoculate a sterile culture flask containing 25 ml of sterile L-broth
containing ampicillin at 100 µg/ml. The culture should preferably be grown to saturation at 37
degrees Celsius, and the saturated culture should preferably be diluted in fresh L-broth. Aliquots of
these dilutions should preferably be plated to determine the dilution and volume which will yield
approximately 5000 distinct and well-separated colonies on solid bacteriological media containing
L-broth containing ampicillin at 100 µg/ml and agar at 1.5% in a 150 mm petri dish when grown
overnight at 37 degrees Celsius. Other known methods of obtaining distinct, well-separated colonies
can also be employed.

Standard colony hybridization procedures should then be used to transfer the colonies to 
nitrocellulose filters and lyse, denature and bake them. 

The filter is then preferably incubated at 65 degrees Celsius for 1 hour with gentle agitation
in 6X SSC (20X stock is 175.3 g NaCl/liter, 88.2 g Na citrate/liter, adjusted to pH 7.0 with NaOH)
containing 0.5% SDS, 100 µg/ml of yeast RNA, and 10 mM EDTA (approximately 10 ml per 150
mm filter). Preferably, the probe is then added to the hybridization mix at a concentration greater
than or equal to 1x10^6 dpm/ml. The filter is then preferably incubated at 65 degrees Celsius with
gentle agitation overnight. The filter is then preferably washed in 500 ml of 2X SSC/0.1% SDS at
room temperature with gentle shaking for 15 minutes. A third wash with 0.1X SSC/0.5% SDS at 65
degrees Celsius for 30 minutes to 1 hour is optional. The filter is then preferably dried and subjected
to autoradiography for sufficient time to visualize the positives on the X-ray film. Other known
hybridization methods can also be employed.
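
The SSC working solutions named above (6X, 2X, 0.1X) are all dilutions of the 20X stock, so the stock volume follows from C1 x V1 = C2 x V2. The helper below is a minimal sketch with a hypothetical name. It covers the SSC component only and ignores the volumes contributed by SDS, yeast RNA, and EDTA.

```python
def stock_volume_ml(stock_x: float, final_x: float,
                    final_volume_ml: float) -> float:
    """Volume of concentrated stock needed, from C1 * V1 = C2 * V2."""
    if stock_x <= 0 or final_x < 0 or final_volume_ml < 0:
        raise ValueError("concentrations and volumes must be non-negative")
    if final_x > stock_x:
        raise ValueError("cannot dilute to a higher concentration")
    return final_x * final_volume_ml / stock_x
```

As a usage example, 10 ml of 6X SSC requires `stock_volume_ml(20, 6, 10)`, i.e. 3 ml of 20X stock, and 500 ml of 2X SSC wash requires 50 ml of stock.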

The positive colonies are picked, grown in culture, and plasmid DNA isolated using
standard procedures. The clones can then be verified by restriction analysis, hybridization analysis,
or DNA sequencing. The plasmid DNA obtained using these procedures may then be manipulated
using standard cloning techniques familiar to those skilled in the art.

Alternatively, to recover cDNA inserts from the pool of bacteria, a PCR can be performed
on plasmid DNA isolated using standard procedures and primers designed at both ends of the cDNA
insertion, including primers designed in the multicloning site of the vector. For example, a PCR
reaction may be conducted using universal primers designed by the plasmid provider or using
primers which are specific to the cDNA of interest. In the case of Bluescript SK(-), a PCR reaction
may be conducted using a primer having the sequence GGAAACAGCTATGACCA and a primer
having the sequence GTAAAACGACGGCCAGT. This will produce a DNA fragment including a
piece of the multiple cloning site and the cDNA insert. If a specific cDNA of interest is to be
recovered, primers may be designed in order to be specific for the 5' end and the 3' end of this
cDNA using sequence information available from the appended sequence listing. The PCR product
which corresponds to the cDNA of interest can then be manipulated using standard cloning
techniques familiar to those skilled in the art.

Therefore, an object of the invention is an isolated, purified, or recombinant polynucleotide
comprising a nucleotide sequence selected from the group consisting of cDNA inserts of the
deposited clone pool. Moreover, preferred polynucleotides of the invention include purified,
isolated, or recombinant GENSET cDNAs consisting of, consisting essentially of, or comprising a
nucleotide sequence selected from the group consisting of cDNA inserts of the deposited clone
pool.

The polynucleotides of SEQ ID NOs: 1-141 may be interchanged with the corresponding
polynucleotides encoded by the human cDNA of the clone inserts of the deposited clone pool. The
polypeptides of SEQ ID NOs: 242-482 may be interchanged with the corresponding polypeptides
encoded by the human cDNA of the clone inserts of the deposited clone pool. The correspondence
between the polynucleotides of SEQ ID Nos: 1-141, the polypeptides of SEQ ID NOs: 242-482 and
clone inserts of the deposited clone pool is given in Table I.

cDNA sequences of the invention

Another object of the invention is a purified, isolated, or recombinant polynucleotide
comprising a nucleotide sequence selected from the group consisting of sequences of SEQ ID Nos:
1-241, complementary sequences thereto, and fragments thereof. Moreover, preferred
polynucleotides of the invention include purified, isolated, or recombinant GENSET cDNAs
consisting of, consisting essentially of, or comprising a sequence selected from the group consisting
of SEQ ID Nos: 1-241.

The GENSET polynucleotide sequences of SEQ ID Nos: 1-241 were then searched for open
reading frames able to encode polypeptides. The GENSET ORFs were also searched to identify
potential signal sequence motifs using slight modifications of the procedures disclosed in Von
Heijne, Nucleic Acids Res. 14:4683-4690, 1986, as described in PCT publication WO 00/37491, the
entire disclosures of which are incorporated herein by reference. The GENSET cDNAs of SEQ ID
Nos: 1-31 and 33-143 encoding polypeptides of SEQ ID Nos: 242-272 and 274-384 were thus
found to contain such signal sequences.
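
The text does not describe how the open reading frame search was carried out. Purely as an illustration, and not as the applicant's method, an ORF scan can be sketched as below. The sketch assumes a simple model: the forward strand only, an ATG start codon, the three standard stop codons, and 1-based coordinates running from the first nucleotide of the start codon to the last nucleotide of the stop codon. The function name is hypothetical.

```python
from typing import Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(cdna: str) -> Optional[Tuple[int, int]]:
    """Return 1-based (first, last) coordinates of the longest ATG...stop ORF.

    Scans the three forward reading frames; returns None if no complete
    ORF (start codon followed by an in-frame stop codon) is found.
    """
    seq = cdna.upper()
    best: Optional[Tuple[int, int]] = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon of a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                candidate = (start + 1, i + 3)  # include the stop codon
                if best is None or candidate[1] - candidate[0] > best[1] - best[0]:
                    best = candidate
                start = None
    return best
```

For the toy sequence `CCATGAAATTTTAGGG`, the only complete ORF starts at nucleotide 3 and ends at nucleotide 14, so the function returns `(3, 14)`.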

Structural parameters of each of the cDNAs of the present invention are described in Table
II. Namely, Table II provides, for each cDNA of SEQ ID Nos: 1-241 referred to by its sequence
identification number (column entitled "Seq Id No"), the locations of the first and last nucleotides
of the coding sequences (listed under the heading "Full Coding Sequence"), and, if applicable, the
locations of the signal sequence and the sequence encoding the mature polypeptide in the case of
secreted proteins (SEQ ID Nos: 1-31 and 33-143) listed under the headings "Signal Sequence" and
"Coding Sequence for the mature Protein" respectively, the locations of the first and last nucleotides
of the polyA signals (listed under the heading "Poly A Signal") and the locations of the first and last
nucleotides of the polyA sites (listed under the heading "Poly A Site").

Accordingly, the full coding sequence (CDS) or open reading frame (ORF) of each cDNA 
of the invention refers to the nucleotide sequence beginning with the first nucleotide of the start 
codon and ending with the last nucleotide of the stop codon (see column entitled "Full coding 
sequence" of Table II for sequences of Seq Id Nos: 1-241). Similarly, the signal sequence of each 
cDNA of the invention refers to the nucleotide sequence beginning with the first nucleotide of the 
start codon and ending with the last nucleotide of the last codon encoding the signal peptide (see 
column entitled "Signal sequence" of Table II for sequences of Seq Id Nos: 1-31 and 33-143) and 
the coding sequence for the mature polypeptide of each cDNA of the invention refers to the 
nucleotide sequence beginning with the first nucleotide of the first codon encoding the mature 
polypeptide and ending with the last nucleotide of the stop codon (see column entitled "Coding 
sequence for mature protein" of Table II for sequences of Seq Id Nos: 1-31 and 33-143). Similarly, 
the 5' untranslated region (or 5'UTR) of each cDNA of the invention refers to the nucleotide 
sequence starting at nucleotide 1 and ending at the nucleotide immediately 5' to the first nucleotide 
of the start codon. The 3' untranslated region (or 3'UTR) of each cDNA of the invention refers to 
the nucleotide sequence starting at the nucleotide immediately 3' to the last nucleotide of the stop 
codon and ending at the last nucleotide of the cDNA. 
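
The region definitions above translate directly into coordinate arithmetic. The following is a minimal sketch, assuming 1-based inclusive positions of the kind listed in Table II. The class CdnaAnnotation, its field names and the function extract_regions are hypothetical names introduced for illustration, and the toy sequence is not taken from the sequence listing.

```python
# Illustrative sketch only: all names and the example sequence are assumptions.
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CdnaAnnotation:
    cds_start: int                    # first nucleotide of the start codon (1-based)
    cds_end: int                      # last nucleotide of the stop codon (1-based)
    signal_end: Optional[int] = None  # last nucleotide of the last signal-peptide codon


def extract_regions(seq: str, ann: CdnaAnnotation) -> Dict[str, str]:
    """Split a cDNA into the regions defined in the text."""
    regions = {
        # nucleotide 1 up to the nucleotide immediately 5' to the start codon
        "5'UTR": seq[: ann.cds_start - 1],
        # start codon through stop codon, inclusive
        "CDS": seq[ann.cds_start - 1 : ann.cds_end],
        # nucleotide immediately 3' to the stop codon up to the last nucleotide
        "3'UTR": seq[ann.cds_end :],
    }
    if ann.signal_end is not None:
        regions["signal"] = seq[ann.cds_start - 1 : ann.signal_end]
        regions["mature"] = seq[ann.signal_end : ann.cds_end]
    return regions


if __name__ == "__main__":
    toy = "GG" + "ATGAAA" + "CCCTAA" + "TTT"
    print(extract_regions(toy, CdnaAnnotation(cds_start=3, cds_end=14, signal_end=8)))
```

For cDNAs without a signal sequence, signal_end is simply left unset and only the 5'UTR, CDS and 3'UTR are returned.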



Untranslated regions 

In addition, the invention concerns a purified, isolated, and recombinant nucleic acid 
comprising a nucleotide sequence selected from the group consisting of the 5'UTRs of sequences of 
SEQ ID Nos: 1-241 and sequences of clone inserts of the deposited clone pool, sequences 
complementary thereto, and allelic variants thereof. The invention also concerns a purified, 
isolated, and recombinant nucleic acid comprising a nucleotide sequence selected from the group 
consisting of the 3'UTRs of sequences of SEQ ID Nos: 1-241 and sequences of clone inserts of the 
deposited clone pool, sequences complementary thereto, and allelic variants thereof. 

These polynucleotides may be used to detect the presence of GENSET mRNA species in a 
biological sample using either hybridization or RT-PCR techniques well known to those skilled in 
the art. 

In addition, these polynucleotides may be used as regulatory molecules able to affect the 
processing and maturation of the polynucleotide including them (either a GENSET polynucleotide 
or a heterologous polynucleotide), preferably the localization, stability and/or translation of said 
polynucleotide including them (for a review on UTRs see Decker and Parker, 1995; Derrigo et al., 
2000). In particular, 3'UTRs may be used in order to control the stability of heterologous mRNAs 
in recombinant vectors using any methods known to those skilled in the art including Makrides 
(1999), US Patents 5,925,56; 5,807,7 and 5,756,264, the disclosures of which are hereby 
incorporated by reference in their entireties. 

Coding sequences 

Another object of the invention is an isolated, purified or recombinant polynucleotide 
comprising the full coding sequence of a sequence selected from the group consisting of sequences 
of SEQ ID Nos: 1-241, clone inserts of the deposited clone pool, and variants thereof. 

A further object of the invention is an isolated, purified or recombinant polynucleotide 
encoding a polypeptide comprising a sequence selected from the group consisting of sequences of 
SEQ ID Nos: 242-482 and allelic variants thereof. Another object of the invention is an isolated, 
purified or recombinant polynucleotide encoding a polypeptide comprising a sequence selected 
from the group consisting of polypeptides encoded by cDNA inserts of the deposited clone pool and 
allelic variants thereof. 

In a preferred embodiment, the invention encompasses an isolated, purified or recombinant 
polynucleotide encoding a polypeptide comprising a sequence selected from the group consisting of 
the mature proteins of SEQ ID Nos: 242-272 and 274-384. In another preferred embodiment, the 
invention encompasses an isolated, purified or recombinant polynucleotide encoding a polypeptide 
comprising a sequence selected from the group consisting of the signal peptides of SEQ ID Nos: 
242-272 and 274-384. 

It will be appreciated that should the extent of the full coding sequence differ from that 
indicated in the appended sequence listing as a result of a sequencing error, reverse transcription or 
amplification error, mRNA splicing, post-translational modification of the encoded protein, 
enzymatic cleavage of the encoded protein, or other biological factors, one skilled in the art would 
be readily able to identify the extent of the full coding sequences in the sequences of SEQ ID Nos: 
1-241. Accordingly, the scope of any claims herein relating to nucleic acids containing the full 
coding sequence of one of SEQ ID Nos: 1-241 is not to be construed as excluding any readily 
identifiable variations from or equivalents to the full coding sequences described in the appended 
sequence listing. Similarly, should the extent of the polypeptides differ from those indicated in the 
appended sequence listing as a result of any of the preceding factors, the scope of claims relating to 
polypeptides comprising the amino acid sequence of the polypeptides of SEQ ID Nos: 242-482 is 
not to be construed as excluding any readily identifiable variations from or equivalents to the 
sequences described in the appended sequence listing. 

It will be appreciated that should the extent of the coding sequence of the mature protein 
differ from that indicated in the appended sequence listing as a result of a sequencing error, reverse 
transcription or amplification error, mRNA splicing, post-translational modification of the encoded 
protein, enzymatic cleavage of the encoded protein, or other biological factors, one skilled in the art 
would be readily able to identify the extent of the coding sequences for the mature protein in the 
sequences of SEQ ID Nos: 1-31 and 33-143. Accordingly, the scope of any claims herein relating 
to nucleic acids containing the coding sequence for the mature proteins of one of SEQ ID Nos: 1-31 
and 33-143 is not to be construed as excluding any readily identifiable variations from or 
equivalents to the coding sequences described in the appended sequence listing. Similarly, should 
the extent of the mature polypeptides differ from those indicated in the appended sequence listing as 
a result of any of the preceding factors, the scope of claims relating to mature polypeptides 
comprising the amino acid sequence of the polypeptides of SEQ ID Nos: 242-272 and 274-384 is 
not to be construed as excluding any readily identifiable variations from or equivalents to the 
sequences described in the appended sequence listing. 

It will be appreciated that should the extent of the coding sequence of the signal peptide 
differ from that indicated in the appended sequence listing as a result of a sequencing error, reverse 
transcription or amplification error, mRNA splicing, post-translational modification of the encoded 
protein, enzymatic cleavage of the encoded protein, or other biological factors, one skilled in the art 
would be readily able to identify the extent of the coding sequences for the signal peptide in the 
sequences of SEQ ID Nos: 1-31 and 33-143. Accordingly, the scope of any claims herein relating 
to nucleic acids containing the signal sequence of one of SEQ ID Nos: 1-31 and 33-143 is not to be 
construed as excluding any readily identifiable variations from or equivalents to the coding 
sequences described in the appended sequence listing. Similarly, should the extent of the signal 
peptides differ from those indicated in the appended sequence listing as a result of any of the 
preceding factors, the scope of claims relating to signal peptides comprising the amino acid 
sequence of the polypeptides of SEQ ID Nos: 242-272 and 274-384 is not to be construed as 
excluding any readily identifiable variations from or equivalents to the sequences described in the 
appended sequence listing. 

The above disclosed polynucleotides that contain the coding sequence (for the full-length 
protein or for the mature protein) of the GENSET genes may be expressed in a desired host cell or a 
desired host organism, when such a polynucleotide is placed under the control of suitable expression 
signals. The expression signals may be either the expression signals contained in the regulatory 
regions in the GENSET genes of the invention or, in contrast, the signals may be exogenous 
regulatory nucleic acid sequences. Such a polynucleotide, when placed under the suitable expression 
signals, may also be inserted in a vector for its expression and/or amplification. 

Further included in the present invention are polynucleotides encoding the polypeptides of 
the present invention that are fused in frame to the coding sequences for additional heterologous 
amino acid sequences. Of special interest are polynucleotides comprising GENSET signal 
sequences fused to a heterologous polypeptide as described in the section entitled "Secretion 
vectors". Also included in the present invention are nucleic acids encoding polypeptides of the 
present invention together with additional, non-coding sequences, including for example, but not 
limited to, non-coding 5' and 3' sequences, vector sequences, and sequences used for purification, 
probing, or priming. For example, heterologous sequences include transcribed, untranslated 
sequences that may play a role in transcription and mRNA processing, for example, ribosome 
binding and stability of mRNA. The heterologous sequences may alternatively comprise additional 
coding sequences that provide additional functionalities. Thus, a nucleotide sequence encoding a 
polypeptide may be fused to a tag sequence, such as a sequence encoding a peptide that facilitates 
purification of the fused polypeptide. In certain preferred embodiments of this aspect of the 
invention, the tag amino acid sequence is a hexa-histidine peptide, such as the tag provided in a 
pQE vector (QIAGEN), among others, many of which are commercially available. For instance, 
hexa-histidine provides for convenient purification of the fusion protein (see Gentz et al., 1989), the 
disclosure of which is incorporated by reference in its entirety. The "HA" tag is another peptide 
useful for purification which corresponds to an epitope derived from the influenza hemagglutinin 
protein (see Wilson et al., 1984), the disclosure of which is incorporated by reference in its entirety. 
As discussed below, other such fusion proteins include the GENSET protein fused to Fc at the N- or 
C-terminus. 
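
As a purely illustrative sketch of an in-frame fusion to a tag sequence, the following inserts six histidine codons immediately before the stop codon of a coding sequence. It assumes the input is a full coding sequence whose length is a multiple of three and which ends in a stop codon. The function name add_c_terminal_his_tag and the choice of the CAT codon are assumptions made for the example and do not reproduce the tag of any particular commercial vector.

```python
# Illustrative sketch only: names and codon choice are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}
HIS6 = "CAT" * 6  # CAT encodes histidine (CAC would serve equally well)


def add_c_terminal_his_tag(cds: str, tag: str = HIS6) -> str:
    """Return the coding sequence with the tag fused in frame before the stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0 or len(tag) % 3 != 0:
        raise ValueError("coding sequence and tag must each be a whole number of codons")
    stop = cds[-3:]
    if stop not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    # Keeping both parts a multiple of three preserves the reading frame.
    return cds[:-3] + tag.upper() + stop


if __name__ == "__main__":
    print(add_c_terminal_his_tag("ATGGCTTAA"))
```

The same frame check applies to any other heterologous coding sequence fused to a polynucleotide of the invention; only the tag string would change.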

Suitable recombinant vectors that contain a polynucleotide such as described herein are 
disclosed elsewhere in the specification. Expression vectors encoding GENSET polypeptides or 
fragments thereof are described in the section entitled "Preparation of the polypeptides". 

Regulatory sequences of the invention 

As mentioned, the genomic sequence of GENSET genes contains regulatory sequences in 
the non-coding 5'-flanking region and possibly in the non-coding 3'-flanking region that border the 
GENSET coding regions containing the exons of these genes. 

Polynucleotides derived from GENSET 5' and 3' regulatory regions are useful in order to 
detect the presence of at least a copy of a genomic nucleotide sequence of the GENSET gene or a 
fragment thereof in a test sample. 

Preferred regulatory sequences 

Polynucleotides carrying the regulatory elements located at the 5' end and at the 3' end of 
GENSET coding regions may be advantageously used to control the transcriptional and 
translational activity of a heterologous polynucleotide of interest. 

Thus, the present invention also concerns a purified or isolated nucleic acid comprising a 
polynucleotide which is selected from the group consisting of the 5' and 3' GENSET regulatory 
regions, sequences complementary thereto, regulatory active fragments and variants thereof. The 
invention also pertains to a purified or isolated nucleic acid comprising a polynucleotide having at 
least 95% nucleotide identity with a polynucleotide selected from the group consisting of GENSET 
5' and 3' regulatory regions, advantageously 99% nucleotide identity, preferably 99.5% nucleotide 
identity and most preferably 99.8% nucleotide identity with a polynucleotide selected from the 
group consisting of GENSET 5' and 3' regulatory regions, sequences complementary thereto, 
variants and regulatory active fragments thereof. 

Another object of the invention consists of purified, isolated or recombinant nucleic acids 
comprising a polynucleotide that hybridizes, under the stringent hybridization conditions defined 
herein, with a polynucleotide selected from the group consisting of the nucleotide sequences of 
GENSET 5' and 3' regulatory regions, sequences complementary thereto, variants and regulatory 
active fragments thereof. 

Preferred fragments of 5' regulatory regions have a length of about 1500 or 1000 
nucleotides, preferably of about 500 nucleotides, more preferably about 400 nucleotides, even more 
preferably 300 nucleotides and most preferably about 200 nucleotides. 

Preferred fragments of 3' regulatory regions are at least 20, 50, 100, 150, 200, 300 or 400 
bases in length. 

"Providing" with respect to, e.g. a biological sample, population of cells, etc. indicates that 
the sample, population of cells, etc. is somehow used in a method or procedure. Significantly, 
"providing" a biological sample or population of cells does not require that the sample or cells are 
specifically isolated or obtained for the purposes of the invention, but can instead refer, for 
example, to the use of a biological sample obtained by another individual, for another purpose. 

"Regulatory active" polynucleotide derivatives of the 5' regulatory region are 
polynucleotides comprising or alternatively consisting of a fragment of said polynucleotide which is 
functional as a regulatory region for expressing a recombinant polypeptide or a recombinant 
polynucleotide in a recombinant cell host. It could act either as an enhancer or as a repressor. For 
the purpose of the invention, a nucleic acid or polynucleotide is "functional" as a regulatory region 
for expressing a recombinant polypeptide or a recombinant polynucleotide if said regulatory 
polynucleotide contains nucleotide sequences which contain transcriptional and translational 
regulatory information, and such sequences are "operably linked" to nucleotide sequences which 
encode the desired polypeptide or the desired polynucleotide. 

The regulatory polynucleotides of the invention may be prepared from the nucleotide 
sequence of a GENSET genomic or cDNA sequence, for example, by cleavage using suitable 
restriction enzymes, or by PCR. The regulatory polynucleotides may also be prepared by digestion 
of a genomic clone containing a GENSET gene by an exonuclease enzyme, such as Bal31 (Wabiko et 
al., 1986), the disclosure of which is incorporated by reference in its entirety. These regulatory 
polynucleotides can also be prepared by nucleic acid chemical synthesis, as described elsewhere in 
the specification. 

The regulatory polynucleotides according to the invention may be part of a recombinant 
expression vector that may be used to express a full coding sequence in a desired host cell or host 
organism. The recombinant expression vectors according to the invention are described elsewhere 
in the specification. 

Preferred 5'-regulatory polynucleotides of the invention include 5'-UTRs of GENSET 
cDNAs, or regulatory active fragments or variants thereof. More preferred 5'-regulatory 
polynucleotides of the invention include sequences selected from the group consisting of 5'-UTRs 
of sequences of SEQ ID Nos: 1-241, 5'-UTRs of clone inserts of the deposited clone pool, 
regulatory active fragments and variants thereof. 

Preferred 3'-regulatory polynucleotides of the invention include 3'-UTRs of GENSET 
cDNAs, or regulatory active fragments or variants thereof. More preferred 3'-regulatory 
polynucleotides of the invention include sequences selected from the group consisting of 3'-UTRs 
of sequences of SEQ ID Nos: 1-241, 3'-UTRs of clone inserts of the deposited clone pool, 
regulatory active fragments and variants thereof. 

A further object of the invention consists of a purified or isolated nucleic acid comprising: 

a) a polynucleotide comprising a 5' regulatory nucleotide sequence selected from the group 
consisting of: 

(i) a nucleotide sequence comprising a polynucleotide of a GENSET 5' regulatory region or 
a complementary sequence thereto; 

(ii) a nucleotide sequence comprising a polynucleotide having at least 95% of nucleotide 
identity with the nucleotide sequence of a GENSET 5' regulatory region or a complementary 
sequence thereto; 

(iii) a nucleotide sequence comprising a polynucleotide that hybridizes under stringent 
hybridization conditions with the nucleotide sequence of a GENSET 5' regulatory region or a 
complementary sequence thereto; and 

(iv) a regulatory active fragment or variant of the polynucleotides in (i), (ii) and (iii); 

b) a nucleic acid molecule encoding a desired polypeptide or a nucleic acid molecule of 
interest, said nucleic acid molecule being operably linked to the polynucleotide defined in (a); and 

c) optionally, a polynucleotide comprising a 3'-regulatory polynucleotide, preferably a 
3'-regulatory polynucleotide of a GENSET gene. 

In a specific embodiment, the nucleic acid defined above includes the 5'-UTR of a 
GENSET cDNA, or a regulatory active fragment or variant thereof. 

In a second specific embodiment, the nucleic acid defined above includes the 3'-UTR of a 
GENSET cDNA, or a regulatory active fragment or variant thereof. 

The regulatory polynucleotide of the 5' regulatory region, or its regulatory active fragments 
or variants, is operably linked at the 5'-end of the nucleic acid molecule encoding the desired 
polypeptide or nucleic acid molecule of interest. 

The regulatory polynucleotide of the 3' regulatory region, or its regulatory active fragments 
or variants, is advantageously operably linked at the 3'-end of the nucleic acid molecule encoding 
the desired polypeptide or nucleic acid molecule of interest. 

The desired polypeptide encoded by the above-described nucleic acid may be of various 
nature or origin, encompassing proteins of prokaryotic, viral or eukaryotic origin. The 
polypeptides that may be expressed under the control of a GENSET regulatory region include 
bacterial, fungal or viral antigens. Also encompassed are eukaryotic proteins such as intracellular 
proteins, such as "house keeping" proteins, membrane-bound proteins, such as mitochondrial 
membrane-bound proteins and cell surface receptors, and secreted proteins such as endogenous 
mediators such as cytokines. The desired polypeptide may be a heterologous polypeptide or a 
GENSET protein, especially a protein with an amino acid sequence selected from the group 
consisting of sequences of SEQ ID Nos: 242-482, fragments and variants thereof. 

The desired nucleic acid encoded by the above-described polynucleotide, usually an RNA 
molecule, may be complementary to a desired coding polynucleotide, for example to a GENSET 
coding sequence, and thus useful as an antisense polynucleotide. Such a polynucleotide may be 
included in a recombinant expression vector in order to express the desired polypeptide or the 
desired nucleic acid in a host cell or in a host organism. Suitable recombinant vectors that contain a 
polynucleotide such as described herein are disclosed elsewhere in the specification. When a 
polynucleotide sequence has been recombinantly introduced into a host cell, the cell is said to be 
"recombinant" for the polynucleotide. 

Polynucleotide variants 

The invention also relates to variants of the polynucleotides described herein and fragments 
thereof. "Variants" of polynucleotides, as the term is used herein, are polynucleotides that differ 
from a reference polynucleotide. Generally, differences are limited so that the nucleotide sequences 
of the reference and the variant are closely similar overall and, in many regions, identical. The 
present invention encompasses both allelic variants and degenerate variants. 

Examples of variant sequences of polynucleotides of the invention are given in the 
appended sequence listing. Table III lists the sequence identification number of all similar 
sequences of the sequence listing, namely variants. All cDNAs referred to by their sequence 
identification number on a given line of the table are thought to be variants of the same GENSET 
gene. 

Allelic variant 

A variant of a polynucleotide may be a naturally occurring variant such as a naturally 
occurring allelic variant, or it may be a variant that is not known to occur naturally. By an "allelic 
variant" is intended one of several alternate forms of a gene occupying a given locus on a 
chromosome of an organism (see Lewin, 1990), the disclosure of which is incorporated by reference 
in its entirety. Diploid organisms may be homozygous or heterozygous for an allelic form. 
Non-naturally occurring variants of the polynucleotide may be made by art-known mutagenesis 
techniques, including those applied to polynucleotides, cells or organisms. 

Degenerate variant 

In addition to the isolated polynucleotides of the present invention, and fragments thereof, 
the invention further includes polynucleotides which comprise a sequence substantially different 
from those described above but which, due to the degeneracy of the genetic code, still encode a 
GENSET polypeptide of the present invention. These polynucleotide variants are referred to as 
"degenerate variants" throughout the instant application. That is, all possible polynucleotide 
sequences that encode the GENSET polypeptides of the present invention are contemplated. This 
includes the genetic code and species-specific codon preferences known in the art. Thus, it would be 
routine for one skilled in the art to generate the degenerate variants described above, for instance, to 
optimize codon expression for a particular host (e.g., change codons in the human mRNA to those 
preferred by other mammalian or bacterial host cells). 
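
The notion of degenerate variants can be made concrete with a short sketch. The following assumes the standard genetic code and a DNA alphabet. The function names translate, count_degenerate_variants and degenerate_variants are hypothetical, and the short peptides used are toy examples rather than sequences from the sequence listing.

```python
# Illustrative sketch only: function names and example peptides are assumptions.
from itertools import product
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Reverse lookup: amino acid -> list of synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(cds):
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def count_degenerate_variants(peptide):
    """Number of distinct nucleotide sequences encoding the given peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide.upper())


def degenerate_variants(peptide):
    """Yield every coding sequence for a (short) peptide."""
    for codons in product(*(SYNONYMS[aa] for aa in peptide.upper())):
        yield "".join(codons)


if __name__ == "__main__":
    print(count_degenerate_variants("ML"), sorted(degenerate_variants("MF")))
```

Because the count grows multiplicatively with peptide length, enumeration is practical only for short peptides. Choosing host-preferred codons would amount to selecting one entry per amino acid from SYNONYMS according to a codon usage table, which is not supplied here.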

Nucleotide changes present in a variant polynucleotide may be silent, which means that 
they do not alter the amino acids encoded by the polynucleotide. However, nucleotide changes may 
also result in amino acid substitutions, additions, deletions, fusions and truncations in the 
polypeptide encoded by the reference sequence. The substitutions, deletions or additions may 
involve one or more nucleotides. The variants may be altered in coding or non-coding regions or 
both. Alterations in the coding regions may produce conservative or non-conservative amino acid 
substitutions, deletions or additions. In the context of the present invention, preferred embodiments 
are those in which the polynucleotide variants encode polypeptides which retain substantially the 
same biological properties or activities as the GENSET protein. More preferred polynucleotide 
variants are those containing conservative substitutions. 

Similar polynucleotides 

Another embodiment of the present invention is a purified, isolated or recombinant 
polynucleotide which is at least 90%, 95%, 96%, 97%, 98% or 99% identical to a polynucleotide 
selected from the group consisting of sequences of SEQ ID Nos: 1-241 and clone inserts of the 
deposited clone pool. The above polynucleotides are included regardless of whether they encode a 
polypeptide having a GENSET biological activity. This is because even where a particular nucleic 
acid molecule does not encode a polypeptide having activity, one of skill in the art would still know 
how to use the nucleic acid molecule, for instance, as a hybridization probe or primer. Uses of the 
nucleic acid molecules of the present invention that do not encode a polypeptide having GENSET 
activity include, inter alia, isolating a GENSET gene or allelic variants thereof from a DNA library, 
and detecting GENSET mRNA expression in biological samples suspected of containing GENSET 
mRNA or DNA by Northern Blot or PCR analysis. 

The present invention is further directed to polynucleotides having sequences with at least 50%, 
60%, 70%, 80%, 90%, 95%, 96%, 97%, 98% or 99% identity to a polynucleotide selected from the 
group consisting of sequences of SEQ ID Nos: 1-241 and clone inserts of the deposited clone pool, 
where said polynucleotides do, in fact, encode a polypeptide having a GENSET biological activity. 
Of course, due to the degeneracy of the genetic code, one of ordinary skill in the art will 
immediately recognize that a large number of the polynucleotides at least 50%, 60%, 70%, 80%, 
90%, 95%, 96%, 97%, 98%, or 99% identical to a polynucleotide selected from the group 
consisting of sequences of SEQ ID Nos: 1-241 and clone inserts of the deposited clone pool will 
encode a polypeptide having biological activity. In fact, since degenerate variants of these 
nucleotide sequences all encode the same polypeptide, this will be clear to the skilled artisan even 
without performing the above described comparison assay. It will be further recognized in the art 
that, for such nucleic acid molecules that are not degenerate variants, a reasonable number will also 
encode a polypeptide having biological activity. This is because the skilled artisan is fully aware of 
amino acid substitutions that are either less likely or not likely to significantly affect protein 
function (e.g., replacing one aliphatic amino acid with a second aliphatic amino acid), as further 
described below. By a polynucleotide having a nucleotide sequence at least, for example, 95% 
"identical" to a reference nucleotide sequence of the present invention, it is intended that the 
nucleotide sequence of the polynucleotide is identical to the reference sequence except that the 
polynucleotide sequence may include up to five point mutations per each 100 nucleotides of the 
reference nucleotide sequence encoding the GENSET polypeptide. In other words, to obtain a 
polynucleotide having a nucleotide sequence at least 95% identical to a reference nucleotide 
sequence, up to 5% of the nucleotides in the reference sequence may be deleted, inserted, or 
substituted with another nucleotide. The query sequence may be an entire sequence selected from 
the group consisting of sequences of SEQ ID Nos: 1-241 and sequences of clone inserts of the 
deposited clone pool, or the ORF (open reading frame) of a polynucleotide sequence selected from 
said group, or any fragment specified as described herein. 
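
The percent identity definition above (up to five point mutations per 100 nucleotides of the reference for 95% identity) can be approximated with a short sketch. This is a simplified illustration that counts the minimum number of single-nucleotide deletions, insertions and substitutions (the Levenshtein edit distance) relative to the length of the reference. It is an assumption made for the example and is not a substitute for the sequence comparison methods described elsewhere in the specification. The function names are hypothetical.

```python
# Illustrative sketch only: edit distance is used here as a stand-in measure.
def edit_distance(a, b):
    """Minimum number of single-nucleotide insertions, deletions or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion from a
                           cur[j - 1] + 1,              # insertion into a
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def percent_identity(reference, query):
    """Identity of query to reference, as a percentage of the reference length."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    edits = edit_distance(reference.upper(), query.upper())
    return max(0.0, 100.0 * (len(reference) - edits) / len(reference))


if __name__ == "__main__":
    ref = "ACGT" * 5                 # 20-nucleotide toy reference
    qry = ref[:10] + ref[11:]        # one nucleotide deleted
    print(percent_identity(ref, qry))
```

Under this simplified measure, a 100-nucleotide reference with five point mutations scores exactly 95%, matching the threshold described in the text.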

Hybridizing Polynucleotides 

In another aspect, the invention provides an isolated or purified nucleic acid molecule 
comprising a polynucleotide which hybridizes under stringent hybridization conditions to any 
polynucleotide of the present invention using any methods known to those skilled in the art 
including those disclosed herein and in particular in the "To find similar sequences" section. Also 
contemplated are nucleic acid molecules that hybridize to the polynucleotides of the present 
invention at lower stringency hybridization conditions, preferably at moderate or low stringency 
conditions as defined herein. Such hybridizing polynucleotides may be of at least 15, 18, 20, 23, 25, 
28, 30, 35, 40, 50, 75, 100, 200, 300, 500, 1000 or 2000 nucleotides in length. 

Of particular interest are the polynucleotides hybridizing to any polynucleotide of the 
invention and encoding GENSET polypeptides, particularly GENSET polypeptides exhibiting a 
GENSET biological activity. 

Of course, a polynucleotide which hybridizes only to polyA+ sequences (such as any 3' 
terminal polyA+ tract of a cDNA shown in the sequence listing), or to a 5' complementary stretch 
of T (or U) residues, would not be included in the definition of "polynucleotide," since such a 
polynucleotide would hybridize to any nucleic acid molecule containing a poly (A) stretch or the 
complement thereof (e.g., practically any double-stranded cDNA clone generated using oligo dT as 
a primer). 

Complementary polynucleotides 

The invention further provides isolated nucleic acid molecules having a nucleotide sequence 
fully complementary to any polynucleotide of the invention. The present invention encompasses a 
purified, isolated or recombinant polynucleotide having a nucleotide sequence complementary to a 
sequence selected from the group consisting of sequences of SEQ ID Nos: 1-241, sequences of 
clone inserts of the deposited clone pool and fragments thereof. Such isolated molecules, 
particularly DNA molecules, are useful as probes for gene mapping and for identifying GENSET 
mRNA in a biological sample, for instance, by PCR or Northern blot analysis. 
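
For illustration, the fully complementary sequence of a polynucleotide, written 5' to 3', is its reverse complement. The following is a minimal sketch assuming an unambiguous DNA or RNA alphabet and returning a DNA sequence. The function name reverse_complement is hypothetical.

```python
# Illustrative sketch only: ambiguity codes are not handled.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand as DNA, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGGCC"))
```

Applying the function twice to a DNA sequence returns the original sequence, which is a convenient sanity check when designing probes.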
Polynucleotide fragments 

The present invention is further directed to polynucleotides encoding portions or fragments 
of the nucleotide sequences described herein. Uses for the polynucleotide fragments of the present 
invention include use as probes, primers and molecular weight markers, and for expressing the 
polypeptide fragments of the present invention. Fragments include portions of polynucleotides 
selected from the group consisting of a) the sequences of SEQ ID Nos: 1-241, b) the genomic 
GENSET sequences, c) the polynucleotides encoding a polypeptide selected from the group 
consisting of the sequences of SEQ ID Nos: 242-482, d) the sequences of clone inserts of the 
deposited clone pool, and e) the polynucleotides encoding the polypeptides encoded by the clone 
inserts of the deposited clone pool. 
Particularly included in the present invention is a purified or isolated polynucleotide comprising at 
least 8 consecutive bases of a polynucleotide of the present invention. In one aspect of this 
embodiment, the polynucleotide comprises at least 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 
100, 150, 200, 300, 400, 500, 800, 1000, 1500, or 2000 consecutive nucleotides of a polynucleotide 
of the present invention. 

In addition to the above preferred polynucleotide sizes, further preferred sub-genuses of 
polynucleotides comprise at least 8 nucleotides, wherein "at least 8" is defined as any integer 
between 8 and the integer representing the 3' most nucleotide position as set forth in the sequence 
listing or elsewhere herein. Further included as preferred polynucleotides of the present invention 
are polynucleotide fragments at least 8 nucleotides in length, as described above, that are further 
specified in terms of their 5' and 3' position. The 5' and 3' positions are represented by the position 
numbers set forth in the appended sequence listing. For allelic, degenerate and other variants, 
position 1 is defined as the 5' most nucleotide of the ORF, i.e., the nucleotide "A" of the start codon 
with the remaining nucleotides numbered consecutively. Therefore, every combination of a 5' and 3' 
nucleotide position that a polynucleotide fragment of the present invention, at least 8 contiguous 
nucleotides in length, could occupy on a polynucleotide of the invention is included in the invention 
as an individual species. The polynucleotide fragments specified by 5' and 3' positions can be 
immediately envisaged and are therefore not individually listed solely for the purpose of not 
unnecessarily lengthening the specification. 

It is noted that the above species of polynucleotide fragments of the present invention may
alternatively be described by the formula "a to b"; where "a" equals the 5' most nucleotide position
and "b" equals the 3' most nucleotide position of the polynucleotide; and further where "a" equals
an integer between 1 and the number of nucleotides of the polynucleotide sequence of the present
invention minus 8, and where "b" equals an integer between 9 and the number of nucleotides of the
polynucleotide sequence of the present invention; and where "a" is an integer smaller than "b" by at
least 8.
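
As a rough illustration of the "a to b" formula, the following Python sketch enumerates the position pairs for a sequence of a given length. The function name `fragment_positions` and the parameter `min_offset` are hypothetical and are not part of the disclosure; the sketch applies the stated rule literally, with "b" exceeding "a" by at least 8.

```python
def fragment_positions(seq_len, min_offset=8):
    """Yield (a, b) pairs of 1-based 5' and 3' positions.

    a runs from 1 to seq_len - min_offset, b runs from a + min_offset
    to seq_len, so that a is smaller than b by at least min_offset.
    """
    for a in range(1, seq_len - min_offset + 1):
        for b in range(a + min_offset, seq_len + 1):
            yield a, b


# Illustrative only: a made-up 10-nucleotide sequence length.
for a, b in fragment_positions(10):
    print(f"{a} to {b}")
```

For a sequence of 10 nucleotides the rule leaves only three species, which shows how quickly the number of pairs grows for cDNAs of realistic length.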

The present invention also provides for the exclusion of any species of polynucleotide
fragments of the present invention specified by 5' and 3' positions or sub-genuses of
polynucleotides specified by size in nucleotides as described above. Any number of fragments
specified by 5' and 3' positions or by size in nucleotides, as described above, may be excluded.
Specifically excluded from the invention are the fragments described in Table IV. For these cDNAs
referred to by their sequence identification numbers, Table IV gives the positions of excluded
fragments within these sequences, namely fragments having substantial homology to polyadenylation
tails and to repeated sequences including Alu, L1, THE and MER repeats, SSTR sequences or
satellite, micro-satellite, and telomeric repeats. Each fragment is represented by a-b where a and b
are the start and end positions respectively of a given excluded fragment. Excluded fragments are
separated from each other by a comma. As used herein the term "polynucleotide described in Table IV"
refers to all polynucleotide fragments defined in Table IV in this manner.
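
The "a-b" notation with comma-separated entries lends itself to simple parsing. The sketch below is a minimal Python illustration; the function name `parse_fragments` and the sample cell contents are hypothetical and are not taken from Table IV.

```python
def parse_fragments(cell):
    """Parse a table cell such as "1-25, 310-342" into (start, end) tuples."""
    fragments = []
    for item in cell.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty cells and trailing commas
        start, end = item.split("-")
        fragments.append((int(start), int(end)))
    return fragments


# Hypothetical cell contents, for illustration only.
print(parse_fragments("1-25, 310-342"))
```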

Preferred included and excluded polynucleotide fragments of the invention are also
described in Tables Va and Vb. For these cDNAs referred to by their sequence identification
numbers, Tables Va and Vb give the positions of preferred fragments within these sequences
(columns entitled "Preferentially included fragments") as well as the positions of preferentially
excluded fragments (columns entitled "Preferentially excluded fragments"). Each fragment is
represented by a-b where a and b are the start and end positions respectively of a given preferred
fragment. Fragments are separated from each other by a comma. As used herein the term "excluded
polynucleotide described in Tables Va and Vb" refers to all polynucleotide fragments preferentially
excluded as described in Tables Va and Vb. As used herein the term "preferred polynucleotide
described in Tables Va and Vb" refers to all preferentially included polynucleotide fragments
listed in Tables Va and Vb in this manner.

Therefore, the present invention encompasses isolated, purified, or recombinant
polynucleotides which consist of, consist essentially of, or comprise a contiguous span of at least
8, 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500, 1000 or 2000
nucleotides of a sequence selected from the group consisting of the sequences of SEQ ID Nos: 1-241
and sequences fully complementary thereto, to the extent that a contiguous span of these lengths is
consistent with the lengths of said selected sequence, wherein said contiguous span comprises at
least 1, 2, 3, 5, 10, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400 or 500
nucleotides of a preferred polynucleotide described in Tables Va and Vb, or a sequence
complementary thereto.
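
The condition that a contiguous span comprise a minimum number of nucleotides of a preferred fragment amounts to an interval-overlap test. The following Python sketch illustrates it under stated assumptions: positions are 1-based and inclusive, and the names `overlap_length` and `span_qualifies` are hypothetical.

```python
def overlap_length(span, fragment):
    """Number of positions shared by two inclusive 1-based intervals."""
    start = max(span[0], fragment[0])
    end = min(span[1], fragment[1])
    return max(0, end - start + 1)


def span_qualifies(span, preferred, min_span=8, min_overlap=1):
    """True if the span is long enough and overlaps a preferred fragment
    by at least min_overlap nucleotides."""
    length = span[1] - span[0] + 1
    return length >= min_span and any(
        overlap_length(span, fragment) >= min_overlap for fragment in preferred
    )


# Made-up coordinates: span 10-30 shares positions 25-30 with fragment 25-40.
print(overlap_length((10, 30), (25, 40)))
```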

The present invention also encompasses isolated, purified, or recombinant polynucleotides
comprising, consisting essentially of, or consisting of a contiguous span of at least 8, 10, 12, 15,
18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500, 1000 or 2000 nucleotides of a
polynucleotide selected from the group consisting of the sequences of SEQ ID Nos: 1-241 and
sequences fully complementary thereto, wherein said contiguous span comprises a preferred
polynucleotide described in Tables Va and Vb, or a sequence complementary thereto, to the extent
that a contiguous span of these lengths is consistent with the length of the selected sequence. The
present invention also encompasses isolated, purified, or recombinant nucleic acids which comprise,
consist of or consist essentially of a contiguous span of a polynucleotide selected from the group
consisting of the sequences of SEQ ID Nos: 1-241 and sequences fully complementary thereto,
wherein said contiguous span comprises a preferred polynucleotide described in Tables Va and Vb, or
a sequence complementary thereto.

Other preferred fragments of the invention are polynucleotides comprising polynucleotides
encoding domains of polypeptides. Such fragments may be used to obtain other polynucleotides
encoding polypeptides having similar domains using hybridization or RT-PCR techniques.
Alternatively, these fragments may be used to express a polypeptide domain which may present a
specific biological property. Preferred domains for the GENSET polypeptides of the invention are
described in Table VI. Thus, another object of the invention is an isolated, purified or recombinant
polynucleotide encoding a polypeptide consisting of, consisting essentially of, or comprising a
contiguous span of at least 5, 6, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, 200, 250,
300, 350, 400, 450 or 500 consecutive amino acids of a sequence selected from the group consisting
of the sequences of SEQ ID Nos: 242-482, to the extent that a contiguous span of these lengths is
consistent with the lengths of said selected sequence, where said contiguous span comprises at least
1, 2, 3, 5, or 10 of the amino acid positions of a domain of said selected sequence. The present
invention also encompasses isolated, purified or recombinant polynucleotides encoding a
polypeptide comprising a contiguous span of at least 5, 6, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60,
75, 100, 150, 200, 250, 300, 350, 400, 450 or 500 consecutive amino acids of a sequence selected
from the group consisting of sequences of SEQ ID Nos: 242-482, to the extent that a contiguous
span of these lengths is consistent with the lengths of said selected sequence, where said contiguous
span is a domain of said selected sequence. The present invention also encompasses isolated,
purified or recombinant polynucleotides encoding a polypeptide comprising a domain of a sequence
selected from the group consisting of the sequences of SEQ ID Nos: 242-482.

The present invention further encompasses any combination of the polynucleotide
fragments listed in this section.

Oligonucleotide primers and probes 

The present invention also encompasses fragments of GENSET polynucleotides for use as
primers and probes. Polynucleotides derived from the GENSET genomic and cDNA sequences are
useful for detecting the presence of at least one copy of a GENSET polynucleotide or fragment,
complement, or variant thereof in a test sample.

Structural definition 

Any polynucleotide of the invention may be used as a primer or probe. Particularly
preferred probes and primers of the invention include isolated, purified, or recombinant
polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70,
80, 90, 100, 150, 200, 500, 1000, 1500 or 2000 nucleotides of a sequence selected from the group
consisting of the GENSET genomic sequences, the cDNA sequences and the sequences fully
complementary thereto. Another object of the invention is a purified, isolated, or recombinant
polynucleotide comprising the nucleotide sequence of a sequence selected from the group
consisting of the sequences of SEQ ID Nos: 1-241, sequences of clone inserts of the deposited clone
pool, sequences fully complementary thereto, allelic variants thereof, and fragments thereof.

Moreover, preferred probes and primers of the invention include purified, isolated, or recombinant
GENSET cDNAs consisting of, consisting essentially of, or comprising the sequences of SEQ ID
Nos: 1-241 and sequences of clone inserts of the deposited clone pool. Particularly preferred probes
and primers of the invention include isolated, purified, or recombinant polynucleotides comprising a
contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500,
1000, 1500 or 2000 nucleotides of a sequence selected from the group consisting of the sequences of
SEQ ID Nos: 1-241 and the sequences fully complementary thereto.

Design of primers and probes 

A probe or a primer according to the invention is between 8 and 1000 nucleotides in
length, or is specified to be at least 12, 15, 18, 20, 25, 35, 40, 50, 60, 70, 80, 100, 250, 500, 1000,
1500 or 2000 nucleotides in length. More particularly, the length of these probes and primers can
range from 8, 10, 15, 20, or 30 to 100 nucleotides, preferably from 10 to 50, more preferably from
15 to 30 nucleotides. Shorter probes and primers tend to lack specificity for a target nucleic acid
sequence and generally require cooler temperatures to form sufficiently stable hybrid complexes
with the template. Longer probes and primers are expensive to produce and can sometimes
self-hybridize to form hairpin structures. The appropriate length for primers and probes under a
particular set of assay conditions may be empirically determined by one of skill in the art. The
formation of stable hybrids depends on the melting temperature (Tm) of the DNA. The Tm depends
on the length of the primer or probe, the ionic strength of the solution and the G+C content. The
higher the G+C content of the primer or probe, the higher the melting temperature, because G:C
pairs are held by three H bonds whereas A:T pairs have only two. The GC content in the probes of
the invention usually ranges between 10 and 75 %, preferably between 35 and 60 %, and more
preferably between 40 and 55 %.
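
To make the dependence of Tm on length and G+C content concrete, the following Python sketch computes the GC percentage of an oligonucleotide and a rough Tm estimate. The text above does not prescribe a formula; the sketch assumes the widely used Wallace rule of thumb for short oligonucleotides (2 °C per A or T, 4 °C per G or C), which ignores ionic strength. The function names and the example sequence are hypothetical.

```python
def gc_content(seq):
    """Percentage of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough Tm in degrees Celsius by the Wallace rule (an assumption here):
    2 degrees per A/T and 4 degrees per G/C. Only meaningful for short oligos."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Made-up 20-mer, for illustration only.
primer = "ATGCGTACCTGAGGCTAACG"
print(gc_content(primer), wallace_tm(primer))
```

A primer pair for amplification could be screened in the same way by comparing the two estimates and keeping pairs whose values differ by only a few degrees.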

For amplification purposes, pairs of primers with approximately the same Tm are
preferable. Primers may be designed using the OSP software (Hillier and Green, 1991), the
disclosure of which is incorporated by reference in its entirety, based on GC content and melting
temperatures of oligonucleotides, or using PC-Rare
(http://bioinformatics.weizmann.ac.il/software/PC-Rare/doc/manuel.html) based on the octamer
frequency disparity method (Griffais et al., 1991), the disclosure of which is incorporated by
reference in its entirety. DNA amplification techniques are well known to those skilled in the art.
Amplification techniques that can be used in the context of the present invention include, but are
not limited to, the ligase chain reaction (LCR) described in EP-A-320 308, WO 9320227 and
EP-A-439 182, the polymerase chain reaction (PCR, RT-PCR) and techniques such as the nucleic acid
sequence based amplification (NASBA) described in Guatelli et al. (1990) and in Compton (1991),
Q-beta amplification as described in European Patent Application No 4544610, strand displacement
amplification as described in Walker et al. (1996) and EP A 684 315, and target mediated
amplification as described in PCT Publication WO 9322461, the disclosures of which are
incorporated by reference in their entireties.

LCR and Gap LCR are exponential amplification techniques; both depend on DNA ligase to
join adjacent primers annealed to a DNA molecule. In Ligase Chain Reaction (LCR), probe pairs
are used which include two primary (first and second) and two secondary (third and fourth) probes,
all of which are employed in molar excess to target. The first probe hybridizes to a first segment of
the target strand and the second probe hybridizes to a second segment of the target strand, the first
and second segments being contiguous so that the primary probes abut one another in 5' phosphate-3'
hydroxyl relationship, and so that a ligase can covalently fuse or ligate the two probes into a fused
product. In addition, a third (secondary) probe can hybridize to a portion of the first probe and a
fourth (secondary) probe can hybridize to a portion of the second probe in a similar abutting
fashion. Of course, if the target is initially double stranded, the secondary probes also will
hybridize to the target complement in the first instance. Once the ligated strand of primary probes
is separated from the target strand, it will hybridize with the third and fourth probes, which can be
ligated to form a complementary, secondary ligated product. It is important to realize that the
ligated products are functionally equivalent to either the target or its complement. By repeated
cycles of hybridization and ligation, amplification of the target sequence is achieved. A method for
multiplex LCR has also been described (WO 9320227), the disclosure of which is incorporated by
reference in its entirety. Gap LCR (GLCR) is a version of LCR where the probes are not adjacent
but are separated by 2 to 3 bases.
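
The relationship between the four LCR probes and the target can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a probe-design procedure: the target is written 5' to 3', the primary probes are taken as reverse complements of two target segments, the secondary probes are taken as the segments themselves, and a non-zero `gap` mimics the Gap LCR spacing. The names `lcr_probes`, `junction`, `probe_len` and `gap`, and the example sequence, are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def lcr_probes(target, junction, probe_len, gap=0):
    """Return ((first, second), (third, fourth)) probe sequences, all 5' to 3'.

    junction is the 0-based index where the downstream target segment begins;
    the upstream segment ends gap bases before it (gap=0 for standard LCR).
    """
    left_start = junction - gap - probe_len
    if left_start < 0 or junction + probe_len > len(target):
        raise ValueError("probes do not fit on the target")
    seg1 = target[left_start:junction - gap].upper()
    seg2 = target[junction:junction + probe_len].upper()
    primary = (reverse_complement(seg1), reverse_complement(seg2))
    secondary = (seg1, seg2)  # hybridize to the primary probes
    return primary, secondary


# Made-up 16-nucleotide target, for illustration only.
print(lcr_probes("ATGCCGTAAGCTTGCA", junction=8, probe_len=4))
```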

For amplification of mRNAs, it is within the scope of the present invention to reverse
transcribe mRNA into cDNA followed by polymerase chain reaction (RT-PCR); or, to use a single
enzyme for both steps as described in U.S. Patent No. 5,322,770; or, to use Asymmetric Gap LCR
(RT-AGLCR) as described by Marshall et al. (1994), the disclosures of which are incorporated by
reference in their entireties. AGLCR is a modification of GLCR that allows the amplification of
RNA.

The PCR technology is the preferred amplification technique used in the present invention.
A variety of PCR techniques are familiar to those skilled in the art. For a review of PCR
technology, see White (1997), Erlich (1992) and the publication entitled "PCR Methods and
Applications" (1991, Cold Spring Harbor Laboratory Press), the disclosures of which are
incorporated by reference in their entireties. In each of these PCR procedures, PCR primers on either
side of the nucleic acid sequences to be amplified are added to a suitably prepared nucleic acid
sample along with dNTPs and a thermostable polymerase such as Taq polymerase, Pfu polymerase,
Tth polymerase or Vent polymerase. The nucleic acid in the sample is denatured and the PCR
primers are specifically hybridized to complementary nucleic acid sequences in the sample. The
hybridized primers are extended. Thereafter, another cycle of denaturation, hybridization, and
extension is initiated. The cycles are repeated multiple times to produce an amplified fragment
containing the nucleic acid sequence between the primer sites. PCR has further been described in
several patents including US Patents 4,683,195; 4,683,202; and 4,965,188, the disclosures of which
are incorporated herein by reference in their entireties.
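
The notion of an amplified fragment lying between the primer sites can be illustrated with a small in silico sketch in Python. It assumes exact primer matches on a single template strand written 5' to 3' and ignores mismatches, multiple binding sites and reaction kinetics; the function names and the sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon(template, forward, reverse):
    """Return the product spanning the forward primer site through the
    reverse primer site on the template, or None if either site is absent."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its site on the
    # given strand reads as the primer's reverse complement.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Made-up template and primers, for illustration only.
print(amplicon("TTACGGATCCAAGTCTGACCTT", "ACGGA", "GGTCA"))
```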

Preparation of primers and probes 

The primers and probes can be prepared by any suitable method, including, for example,
cloning and restriction of appropriate sequences and direct chemical synthesis by a method such as
the phosphotriester method of Narang et al. (1979), the phosphodiester method of Brown et
al. (1979), the diethylphosphoramidite method of Beaucage et al. (1981) and the solid support
method described in EP 0 707 592, which disclosures are hereby incorporated by reference in their
entireties.

Detection probes are generally nucleic acid sequences or uncharged nucleic acid analogs
such as, for example, peptide nucleic acids, which are disclosed in International Patent Application
WO 92/20702, and morpholino analogs, which are described in U.S. Patents Numbered 5,185,444;
5,034,506 and 5,142,047, which disclosures are hereby incorporated by reference in their entireties.
The probe may have to be rendered "non-extendable" in that additional dNTPs cannot be added to
the probe. In and of themselves analogs usually are non-extendable, and nucleic acid probes can be
rendered non-extendable by modifying the 3' end of the probe such that the hydroxyl group is no
longer capable of participating in elongation. For example, the 3' end of the probe can be
functionalized with the capture or detection label to thereby consume or otherwise block the
hydroxyl group. Alternatively, the 3' hydroxyl group simply can be cleaved, replaced or modified.
U.S. Patent Application Serial No. 07/049,061 filed April 19, 1993, which disclosure is hereby
incorporated by reference in its entirety, describes modifications which can be used to render a
probe non-extendable.

Labeling of probes 

Any of the polynucleotides of the present invention can be labeled, if desired, by
incorporating any label known in the art to be detectable by spectroscopic, photochemical,
biochemical, immunochemical, or chemical means. For example, useful labels include radioactive
substances (including 32P, 35S, 3H, 125I), fluorescent dyes (including 5-bromodeoxyuridine,
fluorescein, acetylaminofluorene, digoxigenin) or biotin. Preferably, polynucleotides are labeled at
their 3' and 5' ends. Examples of non-radioactive labeling of nucleic acid fragments are described
in the French patent No. FR-78 10975 or by Urdea et al. (1988) or Sanchez-Pescador et al. (1988),
which disclosures are hereby incorporated by reference in their entireties. In addition, the probes
according to the present invention may have structural characteristics such that they allow signal
amplification, such structural characteristics being, for example, branched DNA probes as those
described by Urdea et al. in 1991 or in the European patent No. EP 0 225 807 (Chiron), which
disclosures are hereby incorporated by reference in their entireties.

The detectable probe may be single stranded or double stranded and may be made using
techniques known in the art, including in vitro transcription, nick translation, or kinase reactions. A
nucleic acid sample containing a sequence capable of hybridizing to the labeled probe is contacted
with the labeled probe. If the nucleic acid in the sample is double stranded, it may be denatured
prior to contacting the probe. In some applications, the nucleic acid sample may be immobilized on
a surface such as a nitrocellulose or nylon membrane. The nucleic acid sample may comprise
nucleic acids obtained from a variety of sources, including genomic DNA, cDNA libraries, RNA, or
tissue samples.

Procedures used to detect the presence of nucleic acids capable of hybridizing to the
detectable probe include well known techniques such as Southern blotting, Northern blotting, dot
blotting, colony hybridization, and plaque hybridization. In some applications, the nucleic acid
capable of hybridizing to the labeled probe may be cloned into vectors such as expression vectors,
sequencing vectors, or in vitro transcription vectors to facilitate the characterization and expression
of the hybridizing nucleic acids in the sample. For example, such techniques may be used to isolate
and clone sequences in a genomic library or cDNA library which are capable of hybridizing to the
detectable probe as described herein.

Immobilization of probes 

A label can also be used to capture the primer, so as to facilitate the immobilization of
either the primer or a primer extension product, such as amplified DNA, on a solid support. A
capture label is attached to the primers or probes and can be a specific binding member which forms
a binding pair with the solid phase reagent's specific binding member (e.g. biotin and
streptavidin). Therefore, depending upon the type of label carried by a polynucleotide or a probe, it
may be employed to capture or to detect the target DNA. Further, it will be understood that the
polynucleotides, primers or probes provided herein may, themselves, serve as the capture label.
For example, in the case where a solid phase reagent's binding member is a nucleic acid sequence,
it may be selected such that it binds a complementary portion of a primer or probe to thereby
immobilize the primer or probe to the solid phase. In cases where a polynucleotide probe itself
serves as the binding member, those skilled in the art will recognize that the probe will contain a
sequence or "tail" that is not complementary to the target. In the case where a polynucleotide
primer itself serves as the capture label, at least a portion of the primer will be free to hybridize with
a nucleic acid on a solid phase. DNA labeling techniques are well known to the skilled technician.

The probes of the present invention are useful for a number of purposes. They can
notably be used in Southern hybridization to genomic DNA. The probes can also be used to detect
PCR amplification products. They may also be used to detect mismatches in the GENSET gene or
mRNA using other techniques.

Any of the polynucleotides, primers and probes of the present invention can be
conveniently immobilized on a solid support. The solid support is not critical and can be selected by
one skilled in the art. Thus, latex particles, microparticles, magnetic beads, non-magnetic beads
(including polystyrene beads), membranes (including nitrocellulose strips), plastic tubes, walls of
microtiter wells, glass or silicon chips, sheep (or other suitable animal's) red blood cells and
duracytes are all suitable examples. Suitable methods for immobilizing nucleic acids on solid
phases include ionic, hydrophobic, covalent interactions and the like. A solid support, as used
herein, refers to any material which is insoluble, or can be made insoluble by a subsequent reaction.
The solid support can be chosen for its intrinsic ability to attract and immobilize the capture reagent.
Alternatively, the solid phase can retain an additional receptor which has the ability to attract and
immobilize the capture reagent. The additional receptor can include a charged substance that is
oppositely charged with respect to the capture reagent itself or to a charged substance conjugated to
the capture reagent. As yet another alternative, the receptor molecule can be any specific binding
member which is immobilized upon (attached to) the solid support and which has the ability to
immobilize the capture reagent through a specific binding reaction. The receptor molecule enables
the indirect binding of the capture reagent to a solid support material before the performance of the
assay or during the performance of the assay. The solid phase thus can be a plastic, derivatized
plastic, magnetic or non-magnetic metal, glass or silicon surface of a test tube, microtiter well,
sheet, bead, microparticle, chip, sheep (or other suitable animal's) red blood cells, duracytes® and
other configurations known to those of ordinary skill in the art. The polynucleotides of the
invention can be attached to or immobilized on a solid support individually or in groups of at least
2, 5, 8, 10, 12, 15, 20, or 25 distinct polynucleotides of the invention to a single solid support. In
addition, polynucleotides other than those of the invention may be attached to the same solid
support as one or more polynucleotides of the invention.

Oligonucleotide array 

A substrate comprising a plurality of oligonucleotide primers or probes of the invention
may be used for detecting or amplifying targeted sequences in GENSET genes, for detecting
mutations in the coding or in the non-coding sequences of GENSET genes, and for determining
GENSET gene expression in different contexts such as in different tissues, at different stages of a
process (embryo development, disease treatment), and in patients versus healthy individuals as
described elsewhere in the application.

As used herein, the term "array" means a one dimensional, two dimensional, or
multidimensional arrangement of nucleic acids of sufficient length to permit specific detection of
gene expression. For example, the array may contain a plurality of nucleic acids derived from
genes whose expression levels are to be assessed. The array may include a GENSET genomic
DNA, a GENSET cDNA, sequences complementary thereto or fragments thereof. Preferably, the
fragments are at least 12, 15, 18, 20, 25, 30, 35, 40 or 50 nucleotides in length. More preferably, the
fragments are at least 100 nucleotides in length. Even more preferably, the fragments are more than
100 nucleotides in length. In some embodiments the fragments may be more than 500 nucleotides
in length.

Any polynucleotide provided herein may be attached in overlapping areas or at random
locations on the solid support. Alternatively the polynucleotides of the invention may be attached
in an ordered array wherein each polynucleotide is attached to a distinct region of the solid support
which does not overlap with the attachment site of any other polynucleotide. Preferably, such an
ordered array of polynucleotides is designed to be "addressable" where the distinct locations are
recorded and can be accessed as part of an assay procedure. Addressable polynucleotide arrays
typically comprise a plurality of different oligonucleotide probes that are coupled to a surface of a
substrate in different known locations. The knowledge of the precise location of each
polynucleotide makes these "addressable" arrays particularly useful in hybridization
assays. Any addressable array technology known in the art can be employed with the
polynucleotides of the invention. One particular embodiment of these polynucleotide arrays is
known as the Genechips™, and has been generally described in US Patent 5,143,854; PCT
publications WO 90/15070 and 92/10092, which disclosures are hereby incorporated by reference in
their entireties. These arrays may generally be produced using mechanical synthesis methods or
light directed synthesis methods which incorporate a combination of photolithographic methods and
solid phase oligonucleotide synthesis (Fodor et al., 1991), which disclosure is hereby incorporated
by reference in its entirety. The immobilization of arrays of oligonucleotides on solid supports has
been rendered possible by the development of a technology generally identified as "Very Large
Scale Immobilized Polymer Synthesis" (VLSIPS™) in which, typically, probes are immobilized in
a high density array on a solid surface of a chip. Examples of VLSIPS™ technologies are provided
in US Patents 5,143,854; and 5,412,087 and in PCT Publications WO 90/15070, WO 92/10092 and
WO 95/11995, which disclosures are hereby incorporated by reference in their entireties, which
describe methods for forming oligonucleotide arrays through techniques such as light-directed
synthesis techniques. In designing strategies aimed at providing arrays of nucleotides immobilized
on solid supports, further presentation strategies were developed to order and display the
oligonucleotide arrays on the chips in an attempt to maximize hybridization patterns and sequence
information. Examples of such presentation strategies are disclosed in PCT Publications WO
94/12305, WO 94/11530, WO 97/29212 and WO 97/31256, the disclosures of which are
incorporated herein by reference in their entireties.
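
The idea of an "addressable" array, in which each polynucleotide occupies a distinct recorded location, can be sketched as a simple mapping in Python. This is only a bookkeeping illustration under assumed conventions (a rectangular grid filled row by row); the function name `build_addressable_array` and the probe identifiers are hypothetical and do not describe any particular array technology.

```python
def build_addressable_array(probes, n_cols):
    """Assign each probe to a distinct (row, column) address, in input order."""
    layout = {}
    for index, probe in enumerate(probes):
        layout[divmod(index, n_cols)] = probe  # (row, column) on the grid
    return layout


# Hypothetical probe identifiers, for illustration only.
layout = build_addressable_array(["probe_1", "probe_2", "probe_3"], n_cols=2)
print(layout[(1, 0)])
```

Because each address holds exactly one probe, a hybridization signal observed at a given location identifies the probe involved.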

Consequently, the invention concerns an array of nucleic acid molecules comprising at least
one polynucleotide of the invention, particularly a probe or primer as described herein. Preferably,
the invention concerns an array of nucleic acid molecules comprising at least two polynucleotides
of the invention, particularly probes or primers as described herein. More preferably, the invention
concerns an array of nucleic acid molecules comprising at least five polynucleotides of the
invention, particularly probes or primers as described herein.

A preferred embodiment of the present invention is an array of polynucleotides of at least
12, 15, 18, 20, 25, 30, 35, 40, 50, 100, 500, 1000, 1500 or 2000 nucleotides in length which includes
at least 1, 2, 5, 10, 15, 20, 35, 50, 100, 150 or 200 sequences selected from the group consisting of
the sequences of SEQ ID Nos: 1-241 and sequences of clone inserts of the deposited clone pool,
sequences fully complementary thereto, and fragments thereof.

Methods of making the polynucleotides of the invention 

The present invention also comprises methods of making the polynucleotides of the
invention, including the polynucleotides of SEQ ID Nos: 1-241, genomic DNA obtainable
therefrom, or fragments thereof. These methods comprise sequentially linking together nucleotides
to produce the nucleic acids having the preceding sequences. Polynucleotides of the invention may
be synthesized either enzymatically using techniques well known to those skilled in the art,
including amplification or hybridization-based methods as described herein, or chemically.

A variety of chemical methods of synthesizing nucleic acids are known to those skilled in
the art. In many of these methods, synthesis is conducted on a solid support. These include the 3'
phosphoramidite methods in which the 3' terminal base of the desired oligonucleotide is
immobilized on an insoluble carrier. The nucleotide base to be added is blocked at the 5' hydroxyl
and activated at the 3' hydroxyl so as to cause coupling with the immobilized nucleotide base.
Deblocking of the new immobilized nucleotide compound and repetition of the cycle will produce
the desired polynucleotide. Alternatively, polynucleotides may be prepared as described in U.S.
Patent No. 5,049,656, which disclosure is hereby incorporated by reference in its entirety. In some
embodiments, several polynucleotides prepared as described above are ligated together to generate
longer polynucleotides having a desired sequence.

Polypeptides of the invention

The term "GENSET polypeptides" is used herein to embrace all of the proteins and
polypeptides of the present invention. The present invention encompasses GENSET polypeptides,
including recombinant, isolated or purified GENSET polypeptides consisting of, consisting
essentially of, or comprising a sequence selected from the group consisting of SEQ ID Nos:
242-482, the polypeptides encoded by human cDNAs contained in the deposited clones, the mature
proteins included in SEQ ID Nos: 242-272 and 274-384, mature proteins encoded by the clone
inserts of the deposited clone pool, and variants thereof. Other objects of the invention are
polypeptides encoded by the polynucleotides of the invention as well as fusion polypeptides
comprising such polypeptides.

Polypeptide variants

The present invention further provides for GENSET polypeptides encoded by allelic and
splice variants, orthologs, and/or species homologues. Procedures known in the art can be used to
obtain allelic variants, splice variants, orthologs, and/or species homologues of polynucleotides
encoding polypeptides of the group consisting of SEQ ID Nos: 242-482, mature proteins
included in SEQ ID Nos: 242-272 and 274-384, and polypeptides, either full-length or mature,
encoded by the clone inserts of the deposited clone pool, using information from the sequences
disclosed herein or the clones deposited with the ATCC.

The polypeptides of the present invention also include polypeptides having an amino acid
sequence at least 50% identical, more preferably at least 60% identical, and still more preferably
at least 70%, 80%, 90%, 95%, 96%, 97%, 98% or 99% identical to a polypeptide selected from the
group consisting of the sequences of SEQ ID Nos: 242-482, mature proteins included in sequences
of SEQ ID Nos: 242-272 and 274-384, and full-length or mature polypeptides encoded by the clone
inserts of the deposited clone pool. By a polypeptide having an amino acid sequence at least, for
example, 95% "identical" to a query amino acid sequence of the present invention, it is intended
that the amino acid sequence of the subject polypeptide is identical to the query sequence except
that the subject polypeptide sequence may include up to five amino acid alterations per each 100
amino acids of the query amino acid sequence. In other words, to obtain a polypeptide having an
amino acid sequence at least 95% identical to a query amino acid sequence, up to 5% (5 of 100) of
the amino acid residues in the subject sequence may be inserted or deleted (indels), or substituted
with another amino acid.
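
By way of non-limiting illustration, the arithmetic of the preceding definition may be sketched in Python as follows. The sketch counts insertions, deletions and substitutions with a plain edit distance, which is an assumption made here for brevity and is not a statement of the alignment method used for the invention; all function names are hypothetical.

```python
def max_alterations(query_len: int, min_identity_pct: int) -> int:
    """Largest number of alterations allowed for a given identity floor."""
    return query_len * (100 - min_identity_pct) // 100


def edit_distance(a: str, b: str) -> int:
    """Minimal number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def meets_identity(subject: str, query: str, min_identity_pct: int) -> bool:
    """True if the subject differs from the query by no more than the allowance."""
    allowed = max_alterations(len(query), min_identity_pct)
    return edit_distance(subject, query) <= allowed
```

With a query of 100 amino acids and a 95% threshold, the allowance computed by `max_alterations` is five alterations, matching the "5 of 100" example in the text.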

Further polypeptides of the present invention include polypeptides which have at least 90%
similarity, more preferably at least 95% similarity, and still more preferably at least 96%, 97%, 98%
or 99% similarity to those described above. By a polypeptide having an amino acid sequence at
least, for example, 95% "similar" to a query amino acid sequence of the present invention, it is
intended that the amino acid sequence of the subject polypeptide is similar (i.e., contains identical or
equivalent amino acid residues) to the query sequence except that the subject polypeptide sequence
may include up to five amino acid alterations per each 100 amino acids of the query amino acid
sequence. In other words, to obtain a polypeptide having an amino acid sequence at least 95%
similar to a query amino acid sequence, up to 5% (5 of 100) of the amino acid residues in the
subject sequence may be inserted or deleted (indels), or substituted with another non-equivalent
amino acid.
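
By way of non-limiting illustration, the distinction between identical and equivalent residues may be sketched in Python as follows. The equivalence groups below are hypothetical, since the present text does not list which residues are regarded as equivalent, and the sketch assumes two sequences that are already aligned without gaps.

```python
# Hypothetical equivalence groups (an assumption for illustration only);
# the text does not define which residues count as "equivalent".
EQUIVALENCE_GROUPS = ("AVLIM", "FYW", "ST", "NQ", "DE", "KRH", "G", "P", "C")
GROUP_OF = {aa: idx for idx, group in enumerate(EQUIVALENCE_GROUPS) for aa in group}


def percent_similarity(subject: str, query: str) -> float:
    """Percent of aligned positions holding identical or equivalent residues.

    Assumes both sequences are already aligned and of equal length (no gaps).
    """
    if len(subject) != len(query) or not query:
        raise ValueError("sequences must be non-empty and of equal length")
    similar = sum(
        1
        for s, q in zip(subject, query)
        if s == q or (s in GROUP_OF and GROUP_OF.get(s) == GROUP_OF.get(q))
    )
    return 100.0 * similar / len(query)
```

Under these assumed groups, a substitution of lysine by arginine leaves the similarity unchanged, whereas a substitution of alanine by glycine lowers it.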

These alterations of the reference sequence may occur at the amino or carboxy terminal 
positions of the reference amino acid sequence or anywhere between those terminal positions, 
interspersed either individually among residues in the reference sequence or in one or more 
contiguous groups within the reference sequence. The query sequence may be an entire amino acid 
sequence selected from the group consisting of sequences of SEQ ID Nos: 242-482 and those
encoded by the clone inserts of the deposited clone pool or any fragment specified as described 
herein. 

The variant polypeptides described herein are included in the present invention regardless
of whether they have their normal biological activity. This is because even where a particular
polypeptide molecule does not have biological activity, one of skill in the art would still know how
to use the polypeptide, for instance, as a vaccine or to generate antibodies. Other uses of the
polypeptides of the present invention that do not have GENSET biological activity include, inter
alia, as epitope tags, in epitope mapping, and as molecular weight markers on SDS-PAGE gels or
on molecular sieve gel filtration columns using methods known to those of skill in the art. As
described below, the polypeptides of the present invention can also be used to raise polyclonal and
monoclonal antibodies, which are useful in assays for detecting GENSET protein expression or as
agonists and antagonists capable of enhancing or inhibiting GENSET protein function. Further,
such polypeptides can be used in the yeast two-hybrid system to "capture" GENSET protein binding
proteins, which are also candidate agonists and antagonists according to the present invention (See,
e.g., Fields et al., 1989), which disclosure is hereby incorporated by reference in its entirety.

Preparation of the polypeptides of the invention 

The polypeptides of the present invention can be prepared in any suitable manner. Such 
polypeptides include isolated naturally occurring polypeptides, recombinantly produced 
polypeptides, synthetically produced polypeptides, or polypeptides produced by a combination of
these methods. The polypeptides of the present invention are preferably provided in an isolated
form, and may be partially or preferably substantially purified. 

Consequently, the present invention also comprises methods of making the polypeptides of
the invention, particularly polypeptides encoded by the cDNAs of SEQ ID Nos: 1-241, mature
proteins encoded by fragments of SEQ ID Nos: 1-31 and 33-143, full-length and mature
polypeptides encoded by the clone inserts of the deposited clone pool, genomic DNA obtainable
therefrom, or fragments thereof, and methods of making the polypeptides of SEQ ID Nos: 242-482,
mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, or fragments thereof. The
methods comprise sequentially linking together amino acids to produce the polypeptides
having the preceding sequences. In some embodiments, the polypeptides made by these methods
are 150 amino acids or less in length. In other embodiments, the polypeptides made by these
methods are 120 amino acids or less in length.

From natural sources

The GENSET proteins of the invention may be isolated from natural sources, including
bodily fluids, tissues and cells, whether directly isolated or cultured cells, of humans or non-human
animals. Methods for extracting and purifying natural proteins are known in the art, and include the
use of detergents or chaotropic agents to disrupt particles followed by differential extraction and
separation of the polypeptides by ion exchange chromatography, affinity chromatography,
sedimentation according to density, and gel electrophoresis. See, for example, "Methods in
Enzymology", Academic Press, 1993, for a variety of methods for purifying proteins, which
disclosure is hereby incorporated by reference in its entirety. Polypeptides of the invention also can
be purified from natural sources using antibodies directed against the polypeptides of the invention,
such as those described herein, in methods which are well known in the art of protein purification.

From recombinant sources 

Preferably, the GENSET polypeptides of the invention are recombinantly produced using
routine expression methods known in the art. The polynucleotide encoding the desired polypeptide
is operably linked to a promoter in an expression vector suitable for any convenient host. Both
eukaryotic and prokaryotic host systems are used in forming recombinant polypeptides. The
polypeptide is then isolated from lysed cells or from the culture medium and purified to the extent
needed for its intended use.

Any GENSET polynucleotide, including those described in SEQ ID Nos: 1-241, those of
clone inserts of the deposited clone pool, and allelic variants thereof may be used to express
GENSET polypeptides. The nucleic acid encoding the GENSET polypeptide to be expressed is
operably linked to a promoter in an expression vector using conventional cloning technology. The
GENSET insert in the expression vector may comprise the full coding sequence for the GENSET
protein or a portion thereof, especially the sequence for a mature polypeptide. For example, the
GENSET derived insert may encode a polypeptide comprising at least 6, 8, 10, 12, 15, 20, 25, 30,
35, 40, 50, 60, 75, 100, 150 or 200 consecutive amino acids of a GENSET protein selected from the
group consisting of sequences of SEQ ID Nos: 242-482 and polypeptides encoded by the clone
inserts of the deposited clone pool.
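
By way of non-limiting illustration, the notion of a polypeptide comprising at least a given number of consecutive amino acids of a protein may be sketched in Python as follows. The function names and the example sequences are hypothetical.

```python
def consecutive_fragments(protein: str, length: int) -> list[str]:
    """All fragments of exactly `length` consecutive residues of `protein`."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


def contains_fragment(candidate: str, protein: str, min_length: int) -> bool:
    """True if `candidate` comprises at least `min_length` consecutive
    residues of `protein`."""
    return any(
        fragment in candidate
        for fragment in consecutive_fragments(protein, min_length)
    )
```

A candidate that contains any window of the minimum length also satisfies every smaller threshold in the list, so only the largest threshold of interest needs to be tested.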

Consequently, a further embodiment of the present invention is a method of making a
polypeptide comprising a protein selected from the group consisting of sequences of SEQ ID Nos:
242-482 and polypeptides encoded by the clone inserts of the deposited clone pool, said method
comprising the steps of

a) obtaining a cDNA comprising a sequence selected from the group consisting of i) the
sequences of SEQ ID Nos: 1-241, ii) the sequences of clone inserts of the deposited clone pool, iii)
sequences encoding one of the polypeptides of SEQ ID Nos: 242-482, and iv) sequences of
polynucleotides encoding a polypeptide which is encoded by one of the clone inserts of the deposited
clone pool;

b) inserting said cDNA in an expression vector such that the cDNA is operably linked to a
promoter; and

c) introducing said expression vector into a host cell whereby said host cell produces said
polypeptide.

In one aspect of this embodiment, the method further comprises the step of isolating the 
polypeptide. Another embodiment of the present invention is a polypeptide obtainable by the 
method described in the preceding paragraph. 

The expression vector is any of the mammalian, yeast, insect or bacterial expression
systems known in the art. Commercially available vectors and expression systems are available
from a variety of suppliers including Genetics Institute (Cambridge, MA), Stratagene (La Jolla,
California), Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to
enhance expression and facilitate proper protein folding, the codon context and codon pairing of the
sequence are optimized for the particular expression organism into which the expression vector is
introduced, as explained in U.S. Patent No. 5,082,767, which disclosure is hereby incorporated by
reference in its entirety.

In one embodiment, the entire coding sequence of a GENSET cDNA and the 3'UTR
through the poly A signal of the cDNA is operably linked to a promoter in the expression vector.
Alternatively, if the nucleic acid encoding a portion of the GENSET protein lacks a methionine to
serve as the initiation site, an initiating methionine can be introduced next to the first codon of the
nucleic acid using conventional techniques. Similarly, if the insert from the GENSET cDNA lacks
a poly A signal, this sequence can be added to the construct by, for example, splicing out the poly A
signal from pSG5 (Stratagene) using BglI and SalI restriction endonuclease enzymes and
incorporating it into the mammalian expression vector pXT1 (Stratagene). pXT1 contains the LTRs
and a portion of the gag gene from Moloney Murine Leukemia Virus. The position of the LTRs in
the construct allows efficient stable transfection. The vector includes the Herpes Simplex Thymidine
Kinase promoter and the selectable neomycin gene. The nucleic acid encoding the GENSET
protein or a portion thereof is obtained by PCR from a vector containing a GENSET cDNA selected
from the group consisting of the sequences of SEQ ID Nos: 1-241 and the clone inserts of the
deposited clone pool using oligonucleotide primers complementary to the GENSET cDNA or
portion thereof and containing restriction endonuclease sequences for PstI incorporated into the 5'
primer and BglII at the 5' end of the corresponding cDNA 3' primer, taking care to ensure that the
sequence encoding the GENSET protein or a portion thereof is positioned properly with respect to
the poly A signal. The purified fragment obtained from the resulting PCR reaction is digested with
PstI, blunt ended with an exonuclease, digested with BglII, purified and ligated to pXT1, now
containing a poly A signal and digested with BglII.

Alternatively, cDNAs encoding secreted proteins may be cloned into pED6dpc2
(DiscoverEase, Genetics Institute, Cambridge, MA). The resulting pED6dpc2 constructs may be
transfected into a suitable host cell, such as COS 1 cells. Methotrexate resistant cells are selected
and expanded. Preferably, the secreted protein expressed from the cDNA is released into the
culture medium thereby facilitating purification.
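
By way of non-limiting illustration, the primer design described above for the PCR step preceding ligation into pXT1, together with the optional introduction of an initiating methionine codon, may be sketched in Python as follows. The recognition sequences for PstI (CTGCAG) and BglII (AGATCT) are the standard ones; the annealing length of 18 nucleotides, the function names and the example coding sequence are assumptions made for this sketch only.

```python
PST_I_SITE = "CTGCAG"   # PstI recognition sequence
BGL_II_SITE = "AGATCT"  # BglII recognition sequence
START_CODON = "ATG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def ensure_initiating_methionine(coding: str) -> str:
    """Prepend ATG when the coding fragment does not already start with it."""
    coding = coding.upper()
    return coding if coding.startswith(START_CODON) else START_CODON + coding


def design_primers(coding: str, anneal_len: int = 18) -> tuple[str, str]:
    """Return (5' primer, 3' primer), each written 5'->3'.

    Each restriction site is placed at the 5' end of its primer.
    `anneal_len` is an illustrative default, not a value from the text.
    """
    coding = ensure_initiating_methionine(coding)
    if len(coding) < anneal_len:
        raise ValueError("coding sequence shorter than the annealing length")
    forward = PST_I_SITE + coding[:anneal_len]
    reverse = BGL_II_SITE + reverse_complement(coding[-anneal_len:])
    return forward, reverse
```

In practice, additional flanking bases and a check for internal PstI or BglII sites in the insert would normally be considered; these are omitted from the sketch.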

In another embodiment, it is often advantageous to add to the recombinant polynucleotide 
additional nucleotide sequence which codes for secretory or leader sequences, pro-sequences, 
sequences which aid in purification, such as multiple histidine residues, or an additional sequence 
for stability during recombinant production. 

As a control, the expression vector lacking a cDNA insert is introduced into host cells or
organisms.

Transfection of a GENSET expressing vector into mouse NIH 3T3 cells is but one
embodiment of introducing polynucleotides into host cells. Introduction of a polynucleotide
encoding a polypeptide into a host cell can be effected by calcium phosphate transfection,
DEAE-dextran mediated transfection, cationic lipid-mediated transfection, electroporation,
transduction, infection, or other methods. Such methods are described in many standard laboratory
manuals, such as Davis et al. (1986), which disclosure is hereby incorporated by reference in its
entirety. It is specifically contemplated that the polypeptides of the present invention may in fact be
expressed by a host cell lacking a recombinant vector.

Recombinant cell extracts, or proteins from the culture medium if the expressed polypeptide
is secreted, are then prepared and proteins separated by gel electrophoresis. If desired, the proteins
may be ammonium sulfate precipitated or separated based on size or charge prior to electrophoresis.
The proteins present are detected using techniques such as Coomassie or silver staining or using
antibodies against the protein encoded by the GENSET cDNA of interest. Coomassie and silver
staining techniques are familiar to those skilled in the art.

Proteins from the host cells or organisms containing an expression vector which contains
the GENSET cDNA or a fragment thereof are compared to those from the control cells or organisms.
The presence of a band from the cells containing the expression vector which is absent in control
cells indicates that the GENSET cDNA is expressed. Generally, the band corresponding to the
protein encoded by the GENSET cDNA will have a mobility near that expected based on the
number of amino acids in the open reading frame of the cDNA. However, the band may have a
mobility different from that expected as a result of modifications such as glycosylation,
ubiquitination, or enzymatic cleavage.
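
By way of non-limiting illustration, the expected size of the band may be approximated from the number of amino acids in the open reading frame, as sketched in Python below. The average residue mass of 110 Da is a common rule-of-thumb value and is an assumption of this sketch rather than a figure taken from the present text; the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_residue_count(orf: str) -> int:
    """Number of amino acids encoded by an ORF, excluding a terminal stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]
    return len(codons)


def expected_mass_kda(orf: str) -> float:
    """Approximate unmodified polypeptide mass in kilodaltons."""
    return orf_residue_count(orf) * AVERAGE_RESIDUE_MASS_DA / 1000.0
```

The estimate ignores signal peptide cleavage and the modifications mentioned above, so an observed band may migrate differently.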

Alternatively, the GENSET polypeptide to be expressed may also be a product of transgenic
animals, i.e., as a component of the milk of transgenic cows, goats, pigs or sheep which are
characterized by somatic or germ cells containing a nucleotide sequence encoding the protein of
interest.

A polypeptide of this invention can be recovered and purified from recombinant cell
cultures by well-known methods including differential extraction, ammonium sulfate or ethanol
precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose
chromatography, hydrophobic interaction chromatography, affinity chromatography,
hydroxylapatite chromatography and lectin chromatography. See, for example, "Methods in
Enzymology", supra, for a variety of methods for purifying proteins. Most preferably, high
performance liquid chromatography ("HPLC") is employed for purification. A recombinantly
produced version of a GENSET polypeptide can be substantially purified using techniques
described herein or otherwise known in the art, such as, for example, by the one-step method
described in Smith and Johnson (1988), which disclosure is hereby incorporated by reference in its
entirety. Polypeptides of the invention also can be purified from recombinant sources using
antibodies directed against the polypeptides of the invention, such as those described herein, in
methods which are well known in the art of protein purification.

Preferably, the recombinantly expressed GENSET polypeptide is purified using standard
immunochromatography techniques such as the one described in the section entitled
"Immunoaffinity Chromatography". In such procedures, a solution containing the protein of
interest, such as the culture medium or a cell extract, is applied to a column having antibodies
against the protein attached to the chromatography matrix. The recombinant protein is allowed to
bind the immunochromatography column. Thereafter, the column is washed to remove
non-specifically bound proteins. The specifically bound protein is then released from the column and
recovered using standard techniques.

If antibody production is not possible, the GENSET cDNA sequence or fragment thereof
may be incorporated into expression vectors designed for use in purification schemes employing
chimeric polypeptides. In such strategies the coding sequence of the GENSET cDNA or fragment
thereof is inserted in frame with the gene encoding the other half of the chimera. The other half of
the chimera may be beta-globin or a nickel binding polypeptide encoding sequence. A
chromatography matrix having antibody to beta-globin or nickel attached thereto is then used to
purify the chimeric protein. Protease cleavage sites may be engineered between the beta-globin
gene or the nickel binding polypeptide and the GENSET cDNA or fragment thereof. Thus, the two
polypeptides of the chimera may be separated from one another by protease digestion.

One useful expression vector for generating beta-globin chimerics is pSG5 (Stratagene),
which encodes rabbit beta-globin. Intron II of the rabbit beta-globin gene facilitates splicing of the
expressed transcript, and the polyadenylation signal incorporated into the construct increases the
level of expression. These techniques as described are well known to those skilled in the art of
molecular biology. Standard methods are published in methods texts such as Davis et al. (1986)
and many of the methods are available from Stratagene, Life Technologies, Inc., or Promega.
Polypeptides may additionally be produced from the construct using in vitro translation systems such
as the In vitro Express™ Translation Kit (Stratagene).

Depending upon the host employed in a recombinant production procedure, the
polypeptides of the present invention may be glycosylated or may be non-glycosylated. In addition,
polypeptides of the invention may also include an initial modified methionine residue, in some
cases as a result of host-mediated processes. Thus, it is well known in the art that the N-terminal
methionine encoded by the translation initiation codon generally is removed with high efficiency
from any protein after translation in all eukaryotic cells. While the N-terminal methionine on most
proteins also is efficiently removed in most prokaryotes, for some proteins, this prokaryotic removal
process is inefficient, depending on the nature of the amino acid to which the N-terminal
methionine is covalently linked.
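
By way of non-limiting illustration, the dependence of methionine removal on the adjacent amino acid may be sketched in Python as follows. The set of residues used below reflects a widely cited rule of thumb (efficient removal when the second residue has a small side chain) and is an assumption of this sketch; the present text does not list the residues concerned, and the function name is hypothetical.

```python
# Second-position residues after which the initiator methionine is commonly
# reported to be removed efficiently. This set is an assumption based on a
# general rule of thumb; it is not taken from the present text.
SMALL_PENULTIMATE = frozenset("GASTCPV")


def likely_mature_n_terminus(protein: str) -> str:
    """Return the sequence with the initiator Met removed when the rule applies."""
    protein = protein.upper()
    if len(protein) >= 2 and protein[0] == "M" and protein[1] in SMALL_PENULTIMATE:
        return protein[1:]
    return protein
```

The sketch is a prediction aid only; the actual N-terminus of an expressed polypeptide would be confirmed experimentally.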

From chemical synthesis 

In addition, polypeptides of the invention, especially short protein fragments, can be
chemically synthesized using techniques known in the art (See, e.g., Creighton, 1983; and
Hunkapiller et al., 1984), which disclosures are hereby incorporated by reference in their entireties.
For example, a polypeptide corresponding to a fragment of a polypeptide sequence of the invention
can be synthesized by use of a peptide synthesizer. A variety of methods of making polypeptides
are known to those skilled in the art, including methods in which the carboxyl terminal amino acid
is bound to polyvinyl benzene or another suitable resin. The amino acid to be added possesses
blocking groups on its amino moiety and any side chain reactive groups so that only its carboxyl
moiety can react. The carboxyl group is activated with carbodiimide or another activating agent
and allowed to couple to the immobilized amino acid. After removal of the blocking group, the
cycle is repeated to generate a polypeptide having the desired sequence. Alternatively, the methods
described in U.S. Patent No. 5,049,656, which disclosure is hereby incorporated by reference in its
entirety, may be used.

Furthermore, if desired, nonclassical amino acids or chemical amino acid analogs can be
introduced as a substitution or addition into the polypeptide sequence. Non-classical amino acids
include, but are not limited to, the D-isomers of the common amino acids, 2,4-diaminobutyric
acid, α-amino isobutyric acid, 4-aminobutyric acid, Abu, 2-amino butyric acid, γ-Abu, ε-Ahx,
6-amino hexanoic acid, Aib, 2-amino isobutyric acid, 3-amino propionic acid, ornithine, norleucine,
norvaline, hydroxyproline, sarcosine, citrulline, homocitrulline, cysteic acid, t-butylglycine,
t-butylalanine, phenylglycine, cyclohexylalanine, β-alanine, fluoroamino acids, designer amino acids
such as β-methyl amino acids, Cα-methyl amino acids, Nα-methyl amino acids, and amino acid
analogs in general. Furthermore, the amino acid can be D (dextrorotary) or L (levorotary).

Modifications

The invention encompasses polypeptides which are differentially modified during or after
translation, e.g., by glycosylation, acetylation, phosphorylation, amidation, derivatization by known
protecting/blocking groups, proteolytic cleavage, linkage to an antibody molecule or other cellular
ligand, etc. Any of numerous chemical modifications may be carried out by known techniques,
including, but not limited to, specific chemical cleavage by cyanogen bromide, trypsin,
chymotrypsin, papain, V8 protease, NaBH4; acetylation, formylation, oxidation, reduction;
metabolic synthesis in the presence of tunicamycin; etc.

Additional post-translational modifications encompassed by the invention include, for
example, N-linked or O-linked carbohydrate chains, processing of N-terminal or C-terminal
ends, attachment of chemical moieties to the amino acid backbone, chemical modifications of
N-linked or O-linked carbohydrate chains, and addition or deletion of an N-terminal methionine
residue as a result of prokaryotic host cell expression. The polypeptides may also be modified with
a detectable label, such as an enzymatic, fluorescent, isotopic or affinity label to allow for detection
and isolation of the protein.

Also provided by the invention are chemically modified derivatives of the polypeptides of
the invention which may provide additional advantages such as increased solubility, stability and
circulating time of the polypeptide, or decreased immunogenicity. See U.S. Patent No. 4,179,337,
which disclosure is hereby incorporated by reference in its entirety. The chemical moieties for
derivatization may be selected from water soluble polymers such as polyethylene glycol, ethylene
glycol/propylene glycol copolymers, carboxymethylcellulose, dextran, polyvinyl alcohol and the
like. The polypeptides may be modified at random positions within the molecule, or at
predetermined positions within the molecule and may include one, two, three or more attached
chemical moieties.

The polymer may be of any molecular weight, and may be branched or unbranched. For
polyethylene glycol, the preferred molecular weight is between about 1 kDa and about 100 kDa (the
term "about" indicating that in preparations of polyethylene glycol, some molecules will weigh
more, some less, than the stated molecular weight) for ease in handling and manufacturing. Other
sizes may be used, depending on the desired therapeutic profile (e.g., the duration of sustained
release desired, the effects, if any, on biological activity, the ease in handling, the degree or lack of
antigenicity and other known effects of the polyethylene glycol to a therapeutic protein or analog).

The polyethylene glycol molecules (or other chemical moieties) should be attached to the
protein with consideration of effects on functional or antigenic domains of the protein. There are a
number of attachment methods available to those skilled in the art, e.g., EP 0 401 384 (coupling
PEG to G-CSF), and Malik et al. (1992) (reporting pegylation of GM-CSF using tresyl chloride),
which disclosures are hereby incorporated by reference in their entireties. For example,
polyethylene glycol may be covalently bound through amino acid residues via a reactive group,
such as a free amino or carboxyl group. Reactive groups are those to which an activated
polyethylene glycol molecule may be bound. The amino acid residues having a free amino group
may include lysine residues and the N-terminal amino acid residues; those having a free carboxyl
group may include aspartic acid residues, glutamic acid residues and the C-terminal amino acid
residue. Sulfhydryl groups may also be used as a reactive group for attaching the polyethylene
glycol molecules. Preferred for therapeutic purposes is attachment at an amino group, such as
attachment at the N-terminus or lysine group.
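
By way of non-limiting illustration, the candidate attachment positions named above may be enumerated from a polypeptide sequence as sketched in Python below. Cysteine is taken as the residue carrying the sulfhydryl group; the function name, the dictionary keys and the example sequence are hypothetical.

```python
def candidate_attachment_sites(protein: str) -> dict[str, list[int]]:
    """Map each reactive-group class to 1-based residue positions.

    amino:      N-terminal residue and lysine (K) residues
    carboxyl:   aspartic acid (D), glutamic acid (E) and the C-terminal residue
    sulfhydryl: cysteine (C) residues
    """
    protein = protein.upper()
    if not protein:
        raise ValueError("protein sequence must not be empty")
    last = len(protein)
    amino = sorted({1} | {i for i, aa in enumerate(protein, 1) if aa == "K"})
    carboxyl = sorted({last} | {i for i, aa in enumerate(protein, 1) if aa in "DE"})
    sulfhydryl = [i for i, aa in enumerate(protein, 1) if aa == "C"]
    return {"amino": amino, "carboxyl": carboxyl, "sulfhydryl": sulfhydryl}
```

The listing identifies positions only; whether a given position is accessible or lies within a functional or antigenic domain would be assessed separately.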

One may specifically desire proteins chemically modified at the N-terminus. Using
polyethylene glycol as an illustration of the present composition, one may select from a variety of
polyethylene glycol molecules (by molecular weight, branching, etc.), the proportion of
polyethylene glycol molecules to protein (polypeptide) molecules in the reaction mix, the type of
pegylation reaction to be performed, and the method of obtaining the selected N-terminally
pegylated protein. The method of obtaining the N-terminally pegylated preparation (i.e., separating
this moiety from other monopegylated moieties if necessary) may be by purification of the
N-terminally pegylated material from a population of pegylated protein molecules. Selective
chemical modification of proteins at the N-terminus may be accomplished by reductive
alkylation, which exploits differential reactivity of different types of primary amino groups (lysine
versus the N-terminal) available for derivatization in a particular protein. Under the appropriate
reaction conditions, substantially selective derivatization of the protein at the N-terminus with a
carbonyl group-containing polymer is achieved.

Multimerization 

The polypeptides of the invention may be monomers or multimers (i.e., dimers, trimers,
tetramers and higher multimers). Accordingly, the present invention relates to monomers and
multimers of the polypeptides of the invention, their preparation, and compositions containing them.
In specific embodiments, the polypeptides of the invention are monomers, dimers, trimers or
tetramers. In additional embodiments, the multimers of the invention are at least dimers, at least
trimers, or at least tetramers.

Multimers encompassed by the invention may be homomers or heteromers. As used herein,
the term "homomer" refers to a multimer containing only polypeptides corresponding to the amino
acid sequences of SEQ ID Nos: 242-482 or encoded by the clone inserts of the deposited clone pool
(including fragments, variants, splice variants, and fusion proteins, corresponding to these
polypeptides as described herein). These homomers may contain polypeptides having identical or
different amino acid sequences. In a specific embodiment, a homomer of the invention is a multimer
containing only polypeptides having an identical amino acid sequence. In another specific
embodiment, a homomer of the invention is a multimer containing polypeptides having different
amino acid sequences. In specific embodiments, the multimer of the invention is a homodimer (e.g.,
containing polypeptides having identical or different amino acid sequences) or a homotrimer (e.g.,
containing polypeptides having identical and/or different amino acid sequences). In additional
embodiments, the homomeric multimer of the invention is at least a homodimer, at least a
homotrimer, or at least a homotetramer.

As used herein, the term "heteromer" refers to a multimer containing one or more
heterologous polypeptides (i.e., polypeptides of different proteins) in addition to the polypeptides of
the invention. In a specific embodiment, the multimer of the invention is a heterodimer, a
heterotrimer, or a heterotetramer. In additional embodiments, the heteromeric multimer of the
invention is at least a heterodimer, at least a heterotrimer, or at least a heterotetramer.
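
By way of non-limiting illustration, the terminology defined above may be sketched in Python as follows. A multimer is named a homomer when every subunit is a polypeptide of the invention, whether or not the sequences are identical, and a heteromer when at least one heterologous polypeptide is present. The class, field and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subunit:
    name: str
    heterologous: bool = False  # True for a polypeptide of a different protein


_SIZE_NAMES = {2: "dimer", 3: "trimer", 4: "tetramer"}


def classify_multimer(subunits: list[Subunit]) -> str:
    """Name a multimer according to the homomer / heteromer definitions."""
    if len(subunits) < 2:
        raise ValueError("a multimer needs at least two subunits")
    if all(s.heterologous for s in subunits):
        raise ValueError("at least one polypeptide of the invention is required")
    prefix = "hetero" if any(s.heterologous for s in subunits) else "homo"
    return prefix + _SIZE_NAMES.get(len(subunits), "multimer")
```

Under this sketch, two different polypeptides of the invention still form a "homodimer", consistent with the definition of homomer given above.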

Multimers of the invention may be the result of hydrophobic, hydrophilic, ionic and/or
covalent associations and/or may be indirectly linked, by for example, liposome formation. Thus,
in one embodiment, multimers of the invention, such as, for example, homodimers or homotrimers,
are formed when polypeptides of the invention contact one another in solution. In another
embodiment, heteromultimers of the invention, such as, for example, heterotrimers or
heterotetramers, are formed when polypeptides of the invention contact antibodies to the
polypeptides of the invention (including antibodies to the heterologous polypeptide sequence in a
fusion protein of the invention) in solution. In other embodiments, multimers of the invention are
formed by covalent associations with and/or between the polypeptides of the invention. Such
covalent associations may involve one or more amino acid residues contained in the polypeptide
sequence (e.g., that recited in the sequence listing, or contained in the polypeptide encoded by a
deposited clone). In one instance, the covalent associations are cross-linking between cysteine
residues located within the polypeptide sequences, which interact in the native (i.e., naturally
occurring) polypeptide. In another instance, the covalent associations are the consequence of
chemical or recombinant manipulation. Alternatively, such covalent associations may involve one
or more amino acid residues contained in the heterologous polypeptide sequence in a fusion protein
of the invention.

In one example, covalent associations are between the heterologous sequence contained in a
fusion protein of the invention (see, e.g., US Patent Number 5,478,925, which disclosure is hereby
incorporated by reference in its entirety). In a specific example, the covalent associations are
between the heterologous sequence contained in an Fc fusion protein of the invention (as described
herein). In another specific example, covalent associations of fusion proteins of the invention are
between heterologous polypeptide sequence from another protein that is capable of forming
covalently associated multimers, such as, for example, osteoprotegerin (see, e.g., International
Publication No: WO 98/49305, the contents of which are herein incorporated by reference in their
entirety). In another embodiment, two or more polypeptides of the invention are joined through
peptide linkers. Examples include those peptide linkers described in U.S. Pat. No. 5,073,627
(hereby incorporated by reference). Proteins comprising multiple polypeptides of the invention
separated by peptide linkers may be produced using conventional recombinant DNA technology.

Another method for preparing multimer polypeptides of the invention involves use of 
polypeptides of the invention fused to a leucine zipper or isoleucine zipper polypeptide sequence. 
Leucine zipper and isoleucine zipper domains are polypeptides that promote multimerization of the 
proteins in which they are found. Leucine zippers were originally identified in several 
DNA-binding proteins, and have since been found in a variety of different proteins (Landschulz et 
al., 1988). Among the known leucine zippers are naturally occurring peptides and derivatives 
thereof that dimerize or trimerize. Examples of leucine zipper domains suitable for producing 
soluble multimeric proteins of the invention are those described in PCT application WO 94/10308, 
hereby incorporated by reference. Recombinant fusion proteins comprising a polypeptide of the 
invention fused to a polypeptide sequence that dimerizes or trimerizes in solution are expressed in 
suitable host cells, and the resulting soluble multimeric fusion protein is recovered from the culture 
supernatant using techniques known in the art. 

Trimeric polypeptides of the invention may offer the advantage of enhanced biological 
activity. Preferred leucine zipper moieties and isoleucine moieties are those that preferentially form 
trimers. One example is a leucine zipper derived from lung surfactant protein D (SPD), as 
described in Hoppe et al. (1994) and in U.S. patent application Ser. No. 08/446,922, which 
disclosure is hereby incorporated by reference in its entirety. Other peptides derived from naturally 
occurring trimeric proteins may be employed in preparing trimeric polypeptides of the invention. In 
another example, proteins of the invention are associated by interactions between Flag® 
polypeptide sequence contained in fusion proteins of the invention containing Flag® polypeptide 
sequence. In a further embodiment, proteins of the invention are associated by 
interactions between heterologous polypeptide sequence contained in Flag® fusion proteins of the 
invention and anti-Flag® antibody. 

The multimers of the invention may be generated using chemical techniques known in the 
art. For example, polypeptides desired to be contained in the multimers of the invention may be 
chemically cross-linked using linker molecules and linker molecule length optimization techniques 
known in the art (see, e.g., US Patent Number 5,478,925, which is herein incorporated by 
reference in its entirety). Additionally, multimers of the invention may be generated using 
techniques known in the art to form one or more inter-molecule cross-links between the cysteine 
residues located within the sequence of the polypeptides desired to be contained in the multimer 
(see, e.g., US Patent Number 5,478,925, which is herein incorporated by reference in its entirety). 
Further, polypeptides of the invention may be routinely modified by the addition of cysteine or 
biotin to the C-terminus or N-terminus of the polypeptide and techniques known in the art may be 
applied to generate multimers containing one or more of these modified polypeptides (see, e.g., US 
Patent Number 5,478,925, which is herein incorporated by reference in its entirety). Additionally, 
techniques known in the art may be applied to generate liposomes containing the polypeptide 
components desired to be contained in the multimer of the invention (see, e.g., US Patent Number 
5,478,925, which is herein incorporated by reference in its entirety). 

Alternatively, multimers of the invention may be generated using genetic engineering 
techniques known in the art. In one embodiment, polypeptides contained in multimers of the 
invention are produced recombinantly using fusion protein technology described herein or 
otherwise known in the art (see, e.g., US Patent Number 5,478,925, which is herein incorporated by 
reference in its entirety). In a specific embodiment, polynucleotides coding for a homodimer of the 
invention are generated by ligating a polynucleotide sequence encoding a polypeptide of the 
invention to a sequence encoding a linker polypeptide and then further to a synthetic polynucleotide 
encoding the translated product of the polypeptide in the reverse orientation from the original 
C-terminus to the N-terminus (lacking the leader sequence) (see, e.g., US Patent Number 5,478,925, 
which is herein incorporated by reference in its entirety). In another embodiment, recombinant 
techniques described herein or otherwise known in the art are applied to generate recombinant 
polypeptides of the invention which contain a transmembrane domain (or hydrophobic or signal 
peptide) and which can be incorporated by membrane reconstitution techniques into liposomes (see, 
e.g., US Patent Number 5,478,925, which is herein incorporated by reference in its entirety). 

Mutated polypeptides 

To improve or alter the characteristics of GENSET polypeptides of the present invention, 
protein engineering may be employed. Recombinant DNA technology known to those skilled in the 
art can be used to create novel mutant proteins or muteins including single or multiple amino acid 
substitutions, deletions, additions, or fusion proteins. Such modified polypeptides can show, e.g., 
increased/decreased biological activity or increased/decreased stability. In addition, they may be 
purified in higher yields and show better solubility than the corresponding natural polypeptide, at 
least under certain purification and storage conditions. Further, the polypeptides of the present 
invention may be produced as multimers including dimers, trimers and tetramers. Multimerization 
may be facilitated by linkers or recombinantly through heterologous polypeptides such as Fc regions. 

N- and C-terminal deletions 

It is known in the art that one or more amino acids may be deleted from the N-terminus or 
C-terminus without substantial loss of biological function. For instance, Ron et al. (1993) reported 
modified KGF proteins that had heparin binding activity even if 3, 8, or 27 N-terminal amino acid 
residues were missing. Accordingly, the present invention provides polypeptides having one or 
more residues deleted from the amino terminus of the polypeptides of SEQ ID Nos: 242-482 or that 
encoded by the clone inserts of the deposited clone pool. Similarly, many examples of biologically 
functional C-terminal deletion mutants are known. For instance, interferon gamma shows up to ten 
times higher activity upon deletion of 8-10 amino acid residues from the C-terminus of the protein 
(see, e.g., Dobeli et al., 1988, which disclosure is hereby incorporated by reference in its entirety). 
Accordingly, the present invention provides polypeptides having one or more residues deleted from 
the carboxy terminus of the polypeptides of SEQ ID Nos: 242-482 or encoded by the clone 
inserts of the deposited clone pool. The invention also provides polypeptides having one or more 
amino acids deleted from both the amino and the carboxyl termini as described below. 

Other mutations 

Other mutants in addition to N- and C-terminal deletion forms of the protein discussed 
above are included in the present invention. It also will be recognized by one of ordinary skill in 
the art that some amino acid sequences of the GENSET polypeptides of the present invention can be 
varied without significant effect on the structure or function of the protein. If such differences in 
sequence are contemplated, it should be remembered that there will be critical areas on the protein 
which determine activity. Thus, the invention further includes variations of the GENSET 
polypeptides which show substantial GENSET polypeptide activity. Such mutants include 
deletions, insertions, inversions, repeats, and substitutions selected according to general rules 
known in the art so as to have little effect on activity. For example, guidance concerning how to 
make phenotypically silent amino acid substitutions is provided. 

There are two main approaches for studying the tolerance of an amino acid sequence to 
change (see Bowie et al., 1994, which disclosure is hereby incorporated by reference in its 
entirety). The first method relies on the process of evolution, in which mutations are either accepted 
or rejected by natural selection. 

The second approach uses genetic engineering to introduce amino acid changes at specific 
positions of a cloned gene and selections or screens to identify sequences that maintain 
functionality. These studies have revealed that proteins are surprisingly tolerant of amino acid 
substitutions. The studies indicate which amino acid changes are likely to be permissive at a certain 
position of the protein. For example, most buried amino acid residues require nonpolar side chains, 
whereas few features of surface side chains are generally conserved. Other such phenotypically 
silent substitutions are described by Bowie et al. (supra) and the references cited therein. 

Typically seen as conservative substitutions are the replacements, one for another, among 
the aliphatic amino acids Ala, Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; 
exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; 
exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and 
Tyr. Thus, the fragment, derivative, analog, or homologue of the polypeptide of the present 
invention may be, for example: (i) one in which one or more of the amino acid residues are 
substituted with a conserved or non-conserved amino acid residue (preferably a conserved amino 
acid residue) and such substituted amino acid residue may or may not be one encoded by the genetic 
code; or (ii) one in which one or more of the amino acid residues includes a substituent group; or 
(iii) one in which the GENSET polypeptide is fused with another compound, such as a compound to 
increase the half-life of the polypeptide (for example, polyethylene glycol); or (iv) one in which 
additional amino acids are fused to the above form of the polypeptide, such as an IgG Fc fusion 
region peptide or leader or secretory sequence or a sequence which is employed for purification of 
the above form of the polypeptide or a pro-protein sequence. Such fragments, derivatives and 
analogs are deemed to be within the scope of those skilled in the art from the teachings herein. 

Thus, the GENSET polypeptides of the present invention may include one or more amino 
acid substitutions, deletions, or additions, either from natural mutations or human manipulation. As 
indicated, changes are preferably of a minor nature, such as conservative amino acid substitutions 
that do not significantly affect the folding or activity of the protein. The following groups of amino 
acids generally represent equivalent changes: (1) Ala, Pro, Gly, Glu, Asp, Gln, Asn, Ser, Thr; (2) 
Cys, Ser, Tyr, Thr; (3) Val, Ile, Leu, Met, Ala, Phe; (4) Lys, Arg, His; (5) Phe, Tyr, Trp, His. 
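
By way of illustration only, the five groups of equivalent changes listed above can be encoded as sets, and a proposed substitution checked for membership in a common group. This is a minimal sketch and not part of the disclosed methods: the names EQUIVALENT_GROUPS and is_equivalent_change are hypothetical, and three-letter residue codes are assumed.

```python
# Groups of amino acids listed above as generally representing equivalent changes.
# EQUIVALENT_GROUPS and is_equivalent_change are illustrative names only.
EQUIVALENT_GROUPS = [
    {"Ala", "Pro", "Gly", "Glu", "Asp", "Gln", "Asn", "Ser", "Thr"},  # group (1)
    {"Cys", "Ser", "Tyr", "Thr"},                                      # group (2)
    {"Val", "Ile", "Leu", "Met", "Ala", "Phe"},                        # group (3)
    {"Lys", "Arg", "His"},                                             # group (4)
    {"Phe", "Tyr", "Trp", "His"},                                      # group (5)
]


def is_equivalent_change(original: str, replacement: str) -> bool:
    """Return True if both residues appear together in at least one group."""
    return any(
        original in group and replacement in group
        for group in EQUIVALENT_GROUPS
    )
```

Because several residues (for example Ser, Thr, Ala, Phe, Tyr and His) belong to more than one group, the check asks only whether any single group contains both residues.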

A specific embodiment of a modified GENSET peptide molecule of interest according to 
the present invention includes, but is not limited to, a peptide molecule which is resistant to 
proteolysis, namely a peptide in which the -CONH- peptide bond is modified and replaced by a (CH2NH) 
reduced bond, a (NHCO) retro-inverso bond, a (CH2-O) methylene-oxy bond, a (CH2-S) 
thiomethylene bond, a (CH2CH2) carba bond, a (CO-CH2) ketomethylene bond, a (CHOH-CH2) 
hydroxyethylene bond, a (N-N) bond, an E-alkene bond, or a -CH=CH- bond. The invention 
also encompasses a human GENSET polypeptide or a fragment or a variant thereof in which at least 
one peptide bond has been modified as described above. 

Amino acids in the GENSET proteins of the present invention that are essential for 
function can be identified by methods known in the art, such as site-directed mutagenesis or 
alanine-scanning mutagenesis (see, e.g., Cunningham et al., 1989, which disclosure is hereby 
incorporated by reference in its entirety). The latter procedure introduces single alanine mutations at 
every residue in the molecule. The resulting mutant molecules are then tested for biological activity 
using assays appropriate for measuring the function of the particular protein. Of special interest are 
substitutions of charged amino acids with other charged or neutral amino acids which may produce 
proteins with highly desirable improved characteristics, such as less aggregation. Aggregation may 
not only reduce activity but also be problematic when preparing pharmaceutical formulations, 
because aggregates can be immunogenic (see, e.g., Pinckard et al., 1967; Robbins et al., 1987; and 
Cleland et al., 1993). 
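
As a purely illustrative sketch of the bookkeeping behind alanine-scanning mutagenesis, the set of single-alanine mutants to be constructed can be enumerated in silico as shown below. The function name alanine_scan is hypothetical, one-letter residue codes are assumed, and the biological activity assay itself is not modeled.

```python
def alanine_scan(sequence: str):
    """Yield (position, original_residue, mutant_sequence) for each single Ala substitution.

    Positions are 1-based. Residues that are already alanine are skipped,
    since substituting them would simply reproduce the starting sequence.
    """
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        yield index + 1, residue, mutant


# Example use: list the mutants for a short hypothetical peptide.
for position, residue, mutant in alanine_scan("MKTLA"):
    print(f"{residue}{position}A\t{mutant}")
```

Each mutant produced in this way would then be expressed and tested with an assay appropriate to the function of the particular protein.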

A further embodiment of the invention relates to a polypeptide which comprises the amino 
acid sequence of a GENSET polypeptide having an amino acid sequence which contains at least one 
conservative amino acid substitution, but not more than 50 conservative amino acid substitutions, 
not more than 40 conservative amino acid substitutions, not more than 30 conservative amino acid 
substitutions, and not more than 20 conservative amino acid substitutions. Also provided are 
polypeptides which comprise the amino acid sequence of a GENSET polypeptide, having at least 
one, but not more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 conservative amino acid substitutions. 



Polypeptide fragments 
Structural definition 

The present invention is further directed to fragments of the amino acid sequences described 
herein such as the polypeptides of SEQ ID Nos: 242-482, mature polypeptides included in SEQ ID 
Nos: 242-272 and 274-384, or full-length or mature polypeptides encoded by the clone inserts of the 
deposited clone pool. More specifically, the present invention embodies purified, isolated, and 
recombinant polypeptides comprising at least 6, preferably at least 8 to 10, more preferably 12, 15, 
20, 25, 30, 35, 40, 50, 60, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 350, 400, 450 or 500 
consecutive amino acids of a polypeptide selected from the group consisting of the sequences of 
SEQ ID Nos: 242-482, mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, and 
full-length or mature polypeptides encoded by the clone inserts of the deposited clone pool, and 
other polypeptides of the present invention. 

In addition to the above polypeptide fragments, further preferred sub-genuses of 
polypeptides comprise at least 6 amino acids, wherein "at least 6" is defined as any integer between 
6 and the integer representing the C-terminal amino acid of the polypeptide of the present invention 
including the polypeptide sequences of the sequence listing below. Further included are species of 
polypeptide fragments at least 6 amino acids in length, as described above, that are further specified 
in terms of their N-terminal and C-terminal positions. Also included in the present invention 
as individual species are all polypeptide fragments, at least 6 amino acids in length, as described 
above, which may be particularly specified by an N-terminal and a C-terminal position. That is, every 
combination of an N-terminal and a C-terminal position that a fragment at least 6 contiguous amino 
acid residues in length could occupy, on any given amino acid sequence of the sequence listing or 
of the present invention, is included in the present invention. 

The present invention also provides for the exclusion of any fragment species specified by 
N-terminal and C-terminal positions or of any fragment sub-genus specified by size in amino acid 
residues as described above. Any number of fragments specified by N-terminal and C-terminal 
positions or by size in amino acid residues as described above may be excluded as individual 
species. 

The above polypeptide fragments of the present invention can be immediately envisaged 
using the above description and are therefore not individually listed solely for the purpose of not 
unnecessarily lengthening the specification. Moreover, the above fragments need not have a 
GENSET biological activity, although polypeptides having these activities are preferred 
embodiments of the invention, since they would be useful, for example, in immunoassays, in 
epitope mapping, epitope tagging, as vaccines, and as molecular weight markers. The above 
fragments may also be used to generate antibodies to a particular portion of the polypeptide. These 
antibodies can then be used in immunoassays well known in the art to distinguish between human 
and non-human cells and tissues or to determine whether cells or tissues in a biological sample are 
or are not of the same type which express the polypeptides of the present invention. 
It is noted that the above species of polypeptide fragments of the present invention may 
alternatively be described by the formula "a to b"; where "a" equals the N-terminal most amino acid 
position and "b" equals the C-terminal most amino acid position of the polypeptide fragment; and further 
where "a" equals an integer between 1 and the number of amino acids of the polypeptide sequence 
of the present invention minus 6, and where "b" equals an integer between 7 and the number of 
amino acids of the polypeptide sequence of the present invention; and where "a" is an integer 
smaller than "b" by at least 6. 
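
The enumeration of fragment species by N-terminal and C-terminal position can be sketched as follows. This is an illustrative sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, positions are 1-based and inclusive, and a fragment is counted by its length (b - a + 1), following the "at least 6 contiguous amino acid residues" definition. Reading the "a to b" formula literally, with "a" smaller than "b" by at least 6, corresponds to calling the same functions with min_len=7.

```python
def fragment_positions(length: int, min_len: int = 6):
    """Yield every (a, b) pair, 1-based and inclusive, with b - a + 1 >= min_len."""
    for a in range(1, length - min_len + 2):
        for b in range(a + min_len - 1, length + 1):
            yield a, b


def count_fragments(length: int, min_len: int = 6) -> int:
    """Closed-form count of the pairs produced by fragment_positions."""
    n = length - min_len + 1  # number of possible start positions
    return n * (n + 1) // 2 if n > 0 else 0


# Example use on a hypothetical 8-residue polypeptide.
for a, b in fragment_positions(8):
    print(f"{a} to {b}")
```

For a polypeptide of N residues the number of such species grows roughly as N squared over two, which is why the fragments are described by formula rather than listed individually.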

The present invention also provides for the exclusion of any species of polypeptide 
fragments of the present invention specified by 5' and 3' positions or sub-genuses of polypeptides 
specified by size in amino acids as described above. Any number of fragments specified by 5' and 
3' positions or by size in amino acids, as described above, may be excluded. Specifically excluded 
from the invention are the polypeptide fragments encoded by the preferentially excluded 
polynucleotide fragments described in Table IV, and in Tables Va and Vb. Table IV and Tables Va 
and Vb provide for the exclusion of polypeptides, independently from each other, in addition to 
those described elsewhere in the specification and are therefore not meant as a limiting description. 

Functional definition 

Preferred polypeptide fragments of the invention are isolated, purified or recombinant 
polypeptides comprising, consisting of, or consisting essentially of signal peptides, preferably signal 
peptides selected from the group consisting of SEQ ID Nos: 242-272 and 274-384, signal peptides 
encoded by sequences of SEQ ID Nos: 1-31 and 33-143 and those encoded by the clone inserts of 
the deposited clone pool. Such polypeptide fragments are useful to design secretion vectors as 
described elsewhere in the application. 

Other preferred polypeptide fragments of the invention are isolated, purified or recombinant 
polypeptides comprising, consisting of, or consisting essentially of mature proteins, preferably 
mature proteins selected from the group consisting of SEQ ID Nos: 242-272 and 274-384, mature 
proteins encoded by sequences of SEQ ID Nos: 1-31 and 33-143 and those encoded by the clone 
inserts of the deposited clone pool. 

Domains 

Preferred polypeptide fragments of the invention are domains of polypeptides of the 
invention. Such domains may comprise linear or structural motifs and signatures 
including, but not limited to, leucine zippers, helix-turn-helix motifs, post-translational modification 
sites such as glycosylation sites, ubiquitination sites, alpha helices, and beta sheets, signal sequences 
encoding signal peptides which direct the secretion of the encoded proteins, sequences implicated in 
transcription regulation such as homeoboxes, acidic stretches, enzymatic active sites, substrate 
binding sites, and enzymatic cleavage sites. Such domains may present a particular biological 
activity such as DNA or RNA-binding, secretion of proteins, transcription regulation, enzymatic 
activity, substrate binding activity, etc. 

A domain generally comprises between 3 and 2000 amino acids. In a preferred 
embodiment, domains comprise a number of amino acids that is any integer between 6 and 500. 
Domains may be synthesized using any methods known to those skilled in the art, including those 
disclosed herein, particularly in the section entitled "Preparation of the polypeptides of the 
invention". Methods for determining the amino acids which make up a domain with a particular 
biological activity include mutagenesis studies and assays to determine the biological activity to be 
tested. 

Alternatively, the polypeptides of the invention may be scanned for motifs, domains and/or 
signatures in databases using any computer method known to those skilled in the art. Searchable 
databases include Prosite (Hofmann et al., 1999; Bucher and Bairoch, 1994), Pfam (Sonnhammer et 
al., 1997; Henikoff et al., 2000; Bateman et al., 2000), Blocks (Henikoff et al., 2000), Prints 
(Attwood et al., 1996), Prodom (Sonnhammer and Kahn, 1994; Corpet et al., 2000), Sbase (Pongor 
et al., 1993; Murvai et al., 2000), Smart (Schultz et al., 1998), Dali/FSSP (Holm and Sander, 1996, 
1997 and 1999), HSSP (Sander and Schneider, 1991), CATH (Orengo et al., 1997; Pearl et al., 
2000), SCOP (Murzin et al., 1995; Lo Conte et al., 2000), COG (Tatusov et al., 1997 and 2000), 
specific family databases and derivatives thereof (Nevill-Manning et al., 1998; Yona et al., 1999; 
Attwood et al., 2000), each of which disclosures are hereby incorporated by reference in their 
entireties. For a review on available databases, see issue 1 of volume 28 of Nucleic Acids Research 
(2000), which disclosure is hereby incorporated by reference in its entirety. 

The polypeptides of SEQ ID Nos: 242-482 were screened for the presence of known 
structural or functional motifs or for the presence of signatures, small amino acid sequences that are 
well conserved amongst the members of a protein family. The search was conducted on the Pfam 
5.5 database using HMMER-2.1.1 (for info see Sonnhammer and Durbin, 
http://www.sanger.ac.uk/Pfam/), on a Blocks Plus database containing Blocks version 12.0, Prints 
version 26.0, Pfam version 5.3, Prodom version 99.1, and Domo version 2.0 using emotif (for info 
see Nevill-Manning et al., PNAS, 95, 5865-5871, (1998), http://motif.stanford.edu/EMOTIF) and on 
the Prosite 16.0 database using bla (Tatusov, R. L. & Koonin, E. V. CABIOS 10, No. 4) and pfscan 
(http://www.isrec.isb-sib.ch/cgi-bin/man.cgi?section=1&topic=pfscan). Some of these predicted 
domains are described in Table VI. For these polypeptides referred to by their sequence 
identification numbers (column entitled "Seq Id No"), Table VI gives the designation of the domain 
(column entitled "Designation of domain") according to the database of domains indicated in the 
column entitled "Database" and the positions of preferred fragments within these sequences (column 
entitled "Positions of domains"). Each fragment is represented by a-b, where a and b are the start and 
end positions, respectively, of a given preferred fragment on the full-length polypeptide. Preferred 
fragments are separated from each other by a comma. As used herein, the term "domain described in 
Table VI" refers to all the domains listed in Table VI for a given GENSET protein referred to by its 
sequence identification number in the first column. It should be noted that in Table VI, the first 
methionine encountered is designated as amino acid number 1, i.e., the leader sequence is not 
numbered negatively. In the appended sequence listing, the first amino acid of the mature protein 
resulting from cleavage of the signal peptide is designated as amino acid number 1 and the first 
amino acid of the signal peptide is designated with the appropriate negative number, in accordance 
with the regulations governing sequence listings. 
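
The two numbering conventions just described can be reconciled with a small helper, sketched below under stated assumptions. The function names and the example position string are hypothetical and are not taken from Table VI. The conversion assumes that the signal peptide length is known and that the sequence listing skips the number zero, so that the last residue of the signal peptide is -1 and the first residue of the mature protein is 1.

```python
def parse_positions(field: str):
    """Parse a comma-separated 'a-b' position field into a list of (start, end) pairs."""
    ranges = []
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        start, end = item.split("-")
        ranges.append((int(start), int(end)))
    return ranges


def to_listing_number(position: int, signal_len: int) -> int:
    """Convert a position counted from the first methionine (Met = 1) to
    sequence-listing numbering (mature residue 1 = +1, signal peptide negative)."""
    if position < 1:
        raise ValueError("positions counted from the first methionine start at 1")
    if position > signal_len:
        return position - signal_len   # residue lies in the mature protein
    return position - signal_len - 1   # residue lies in the signal peptide; no zero


# Example use with a hypothetical entry and a hypothetical 20-residue signal peptide.
for start, end in parse_positions("12-45, 60-98"):
    print(start, end, to_listing_number(start, 20), to_listing_number(end, 20))
```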

Consequently, preferred polypeptide fragments of the invention are domains of the 
polypeptides of SEQ ID Nos: 242-482. Therefore, the present invention encompasses isolated, 
purified, or recombinant polypeptides which consist of, consist essentially of, or comprise a 
contiguous span of at least 6, preferably at least 8 to 10, more preferably 12, 15, 20, 25, 30, 35, 40, 
50, 60, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 350, 400, 450 or 500 amino acids of a 
sequence selected from the group consisting of the sequences of SEQ ID Nos: 242-482, to the 
extent that a contiguous span of these lengths is consistent with the lengths of said selected 
sequence, where said contiguous span comprises at least 1, 2, 3, 5, or 10 amino acid positions of a 
domain described in Table VI of said selected sequence. The present invention also encompasses 
isolated, purified, or recombinant polypeptides comprising, consisting essentially of, or consisting 
of a contiguous span of at least 6, preferably at least 8 to 10, more preferably 12, 15, 20, 25, 30, 35, 
40, 50, 60, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 350, 400, 450 or 500 amino acids of a 
sequence selected from the group consisting of the sequences of SEQ ID Nos: 242-482, to the 
extent that a contiguous span of these lengths is consistent with the lengths of said selected 
sequence, where said contiguous span is a domain described in Table VI of said selected sequence. 
The present invention also encompasses isolated, purified, or recombinant polypeptides which 
comprise, consist of or consist essentially of a domain described in Table VI of a sequence selected 
from the group consisting of the sequences of SEQ ID Nos: 242-482. 
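
Whether a given contiguous span comprises at least a stated number of amino acid positions of a domain reduces to an interval-overlap calculation, sketched below. The function names are hypothetical, and both the span and the domain are assumed to be given as 1-based inclusive (start, end) pairs in the same numbering.

```python
def shared_positions(span, domain) -> int:
    """Number of residue positions common to two inclusive (start, end) ranges."""
    start = max(span[0], domain[0])
    end = min(span[1], domain[1])
    return max(0, end - start + 1)


def span_comprises_domain_positions(span, domain, min_shared: int = 1) -> bool:
    """True if the span shares at least min_shared positions with the domain."""
    return shared_positions(span, domain) >= min_shared


# Example use with hypothetical coordinates: span 10-30 against a domain at 25-60.
print(shared_positions((10, 30), (25, 60)))
```

A span that is itself a domain corresponds to the special case in which the two ranges are identical.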

Polypeptides of the present invention that are not specifically described in this table are not 
thereby considered as not belonging to a domain, because they may simply not be recognized as such 
by the particular algorithms used or may not be included in the particular databases searched. In fact, all 
fragments of the polypeptides of the present invention, at least 6 amino acid residues in length, are 
included in the present invention as being a domain. Amino acid residues comprising other 
domains may be determined by searching databases other than those cited here to establish 
Table VI. The domains of the present invention preferably comprise 6 to 200 amino acids (i.e., any 
integer between 6 and 200, inclusive) of a polypeptide of the present invention. Also included in 
the present invention are domain fragments between the integers of 6 and the full-length GENSET 
sequence of the sequence listing. All combinations of sequences between the integers of 6 and the 
full-length sequence of a GENSET polypeptide are included. The domain fragments may be 
specified by either the number of contiguous amino acid residues (as a sub-genus) or by specific 
N-terminal and C-terminal positions (as species) as described above for the polypeptide fragments of 
the present invention. Any number of domain fragments of the present invention may also be 
excluded in the same manner. 

Epitopes and Antibody Fusions: 

A preferred embodiment of the present invention is directed to epitope-bearing polypeptides 
and epitope-bearing polypeptide fragments. These epitopes may be "antigenic epitopes" or both an 
"antigenic epitope" and an "immunogenic epitope". An "immunogenic epitope" is defined as a part 
of a protein that elicits an antibody response in vivo when the polypeptide is the immunogen. On 
the other hand, a region of polypeptide to which an antibody binds is defined as an "antigenic 
determinant" or "antigenic epitope." The number of immunogenic epitopes of a protein generally is 
less than the number of antigenic epitopes (see, e.g., Geysen et al., 1984, which disclosure is 
hereby incorporated by reference in its entirety). It is particularly noted that although a particular 
epitope may not be immunogenic, it is nonetheless useful since antibodies can be made to both 
immunogenic and antigenic epitopes. 

An epitope can comprise as few as 3 amino acids in a spatial conformation which is unique 
to the epitope. Generally an epitope consists of at least 6 such amino acids, and more often at least 
8-10 such amino acids. In a preferred embodiment, antigenic epitopes comprise a number of amino 
acids that is any integer between 3 and 50. Fragments which function as epitopes may be produced 
by any conventional means (see, e.g., Houghten, 1985), also further described in U.S. Patent No. 
4,631,211, which disclosures are hereby incorporated by reference in their entireties. Methods for 
determining the amino acids which make up an epitope include x-ray crystallography, 
2-dimensional nuclear magnetic resonance, and epitope mapping, e.g., the Pepscan method described 
by Geysen et al. (1984); PCT Publication No. WO 84/03564; and PCT Publication No. WO 
84/03506, which disclosures are hereby incorporated by reference in their entireties. Another 
example is the algorithm of Jameson and Wolf (1988) (said reference incorporated by reference in 
its entirety). The Jameson-Wolf antigenic analysis, for example, may be performed using the 
computer program PROTEAN, using default parameters (Version 4.0 Windows, DNASTAR, Inc., 
1228 South Park Street, Madison, WI). 

Antigenic epitopes predicted by the Jameson-Wolf algorithm for the polypeptides of SEQ
ID Nos: 242-482 are presented in Table VII. For each GENSET polypeptide referred to by its
sequence identification number in the column entitled "Seq Id No", a list of antigenic epitopes is
given in the column entitled "Epitopes", each epitope being separated by a comma. Each fragment is
represented by a-b, where a and b are the start and end positions, respectively, of a given preferred
fragment. It should be noted that in Table VII, the first methionine encountered is designated as
amino acid number 1, i.e., the leader sequence is not numbered negatively. In the appended
sequence listing, the first amino acid of the mature protein resulting from cleavage of the signal
peptide is designated as amino acid number 1 and the first amino acid of the signal peptide is
designated with the appropriate negative number, in accordance with the regulations governing
sequence listings. As used herein, the term "epitope described in Table VII" refers to all preferred
polypeptide fragments described in the second column of Table VII for a GENSET polypeptide
referred to by its sequence identification number in the first column. It is pointed out that the
immunogenic epitopes listed in Table VII describe only amino acid residues comprising epitopes
predicted to have the highest degree of immunogenicity by a particular algorithm. Polypeptides of
the present invention that are not specifically described as immunogenic are not considered
non-antigenic. This is because they may still be antigenic in vivo but merely not recognized as such by
the particular algorithm used. Alternatively, the polypeptides are most likely antigenic in vitro
using methods such as phage display. Thus, listed in Table VII are the amino acid residues
comprising only preferred epitopes, not a complete list. In fact, all fragments of the polypeptides of
the present invention, at least 6 amino acid residues in length, are included in the present invention
as being useful as antigenic epitopes. Amino acid residues comprising other immunogenic epitopes
may be determined by algorithms similar to the Jameson-Wolf analysis or by in vivo testing for an
antigenic response using the methods described herein or those known in the art.
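
The two numbering conventions described above can be illustrated with a minimal Python sketch. It parses an "Epitopes" entry written as comma-separated a-b ranges and converts a Table VII position (first methionine = 1) into the sequence-listing convention (first residue of the mature protein = 1, signal peptide numbered negatively). The function names, the example entry, and the signal peptide length are hypothetical; the sketch also assumes that the listing numbering has no position 0, so that the residue immediately preceding the mature protein is -1.

```python
from typing import List, Tuple


def parse_epitopes(cell: str) -> List[Tuple[int, int]]:
    """Parse an 'Epitopes' entry such as '12-19, 45-60' into (start, end) pairs."""
    ranges = []
    for item in cell.split(","):
        item = item.strip()
        if not item:
            continue
        start_text, end_text = item.split("-")
        start, end = int(start_text), int(end_text)
        if start < 1 or end < start:
            raise ValueError(f"invalid epitope range: {item!r}")
        ranges.append((start, end))
    return ranges


def to_listing_position(pos: int, signal_len: int) -> int:
    """Convert a Table VII position (Met = 1) to sequence-listing numbering.

    Assumption: the listing has no position 0, so the last residue of the
    signal peptide is -1 and the first residue of the mature protein is 1.
    """
    if pos < 1:
        raise ValueError("Table VII positions start at 1")
    offset = pos - signal_len
    return offset if offset > 0 else offset - 1


if __name__ == "__main__":
    # Hypothetical entry and a hypothetical 20-residue signal peptide.
    for start, end in parse_epitopes("12-19, 45-60"):
        print(start, end, to_listing_position(start, 20), to_listing_position(end, 20))
```

With a signal peptide of 20 residues, Table VII position 21 corresponds to position 1 of the mature protein in the sequence listing, and Table VII position 1 corresponds to position -20.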

Therefore, the present invention encompasses isolated, purified, or recombinant
polypeptides which consist of, consist essentially of, or comprise a contiguous span of at least 6,
preferably at least 8 to 10, more preferably 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 125, 150, 175,
200, 225, 250, 275, 300, 350, 400, 450 or 500 amino acids of a sequence selected from the group
consisting of the sequences of SEQ ID Nos: 242-482, to the extent that a contiguous span of these
lengths is consistent with the lengths of said selected sequence, where said contiguous span
comprises at least 1, 2, 3, 5, or 10 amino acid positions of an epitope described in Table VII of said
selected sequence. The present invention also encompasses isolated, purified, or recombinant
polypeptides comprising, consisting essentially of, or consisting of a contiguous span of at least 6,
preferably at least 8 to 10, more preferably 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 125, 150, 175,
200, 225, 250, 275, 300, 350, 400, 450 or 500 amino acids of a sequence selected from the group
consisting of the sequences of SEQ ID Nos: 242-482, to the extent that a contiguous span of these
lengths is consistent with the lengths of said selected sequence, where said contiguous span is an
epitope described in Table VII of said selected sequence. The present invention also encompasses
isolated, purified, or recombinant polypeptides which comprise, consist of or consist essentially of
an epitope described in Table VII of a sequence selected from the group consisting of the sequences
of SEQ ID Nos: 242-482.

The epitope-bearing fragments of the present invention preferably comprise 6 to 50 amino
acids (i.e., any integer between 6 and 50, inclusive) of a polypeptide of the present invention. Also
included in the present invention are antigenic fragments of any integer length between 6 amino
acids and the full length of the GENSET sequence of the sequence listing. All combinations of
sequences between the integers of 6 and the full-length sequence of a GENSET polypeptide are
included. The epitope-bearing fragments may be specified by either the number of contiguous amino
acid residues (as a sub-genus) or by specific N-terminal and C-terminal positions (as species) as
described above for the polypeptide fragments of the present invention. Any number of
epitope-bearing fragments of the present invention may also be excluded in the same manner.
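
As an illustration of how fragments specified by length (sub-genus) relate to fragments specified by N-terminal and C-terminal positions (species), the following Python sketch enumerates every contiguous span of a given length that contains at least a chosen number of positions of one epitope. It is only a sketch under stated assumptions: the function name, the 20-residue sequence length, and the epitope 8-12 are hypothetical and are not taken from Table VII.

```python
from typing import Iterator, Tuple


def spans_overlapping_epitope(
    seq_len: int,
    epitope: Tuple[int, int],
    span_len: int,
    min_overlap: int,
) -> Iterator[Tuple[int, int]]:
    """Yield (N-terminal, C-terminal) positions, 1-based and inclusive, of every
    contiguous span of span_len residues that shares at least min_overlap
    positions with the epitope.

    If span_len exceeds seq_len, nothing is yielded, which mirrors the proviso
    that a span length must be consistent with the length of the sequence.
    """
    ep_start, ep_end = epitope
    for n_term in range(1, seq_len - span_len + 2):
        c_term = n_term + span_len - 1
        overlap = min(c_term, ep_end) - max(n_term, ep_start) + 1
        if overlap >= min_overlap:
            yield (n_term, c_term)


if __name__ == "__main__":
    # Hypothetical 20-residue polypeptide with an epitope at positions 8-12.
    for n_term, c_term in spans_overlapping_epitope(20, (8, 12), 6, 3):
        print(f"{n_term}-{c_term}")
```

Each printed pair is one species in the sense used above; the full set for a given length is the corresponding sub-genus restricted to spans that overlap the epitope.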

Antigenic epitopes are useful, for example, to raise antibodies, including monoclonal
antibodies, that specifically bind the epitope (See, Wilson et al., 1984; and Sutcliffe et al., 1983),
which disclosures are hereby incorporated by reference in their entireties. The antibodies are then
used in various techniques such as diagnostic and tissue/cell identification techniques, as described
herein, and in purification methods such as immunoaffinity chromatography.

An antibody or other compound that specifically binds to a polypeptide or polynucleotide of
the invention is also said to "selectively recognize" the polypeptide or polynucleotide.

Similarly, immunogenic epitopes can be used to induce antibodies according to methods
well known in the art (See, Sutcliffe et al., supra; Wilson et al., supra; Chow et al., (1985); and
Bittle et al., (1985), which disclosures are hereby incorporated by reference in their entireties). A
preferred immunogenic epitope includes the natural GENSET protein. The immunogenic epitopes
may be presented together with a carrier protein, such as an albumin, to an animal system (such as
rabbit or mouse) or, if the epitope is long enough (at least about 25 amino acids), without a carrier.
However, immunogenic epitopes comprising as few as 8 to 10 amino acids have been shown to be
sufficient to raise antibodies capable of binding to, at the very least, linear epitopes in a denatured
polypeptide (e.g., in Western blotting).

Epitope-bearing polypeptides of the present invention are used to induce antibodies
according to methods well known in the art including, but not limited to, in vivo immunization, in
vitro immunization, and phage display methods (See, e.g., Sutcliffe et al., supra; Wilson et al.,
supra; and Bittle et al., supra). If in vivo immunization is used, animals may be immunized with
free peptide; however, anti-peptide antibody titer may be boosted by coupling of the peptide to a
macromolecular carrier, such as keyhole limpet hemocyanin (KLH) or tetanus toxoid. For instance,
peptides containing cysteine residues may be coupled to a carrier using a linker such as
m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS), while other peptides may be coupled to
carriers using a more general linking agent such as glutaraldehyde. Animals such as rabbits, rats
and mice are immunized with either free or carrier-coupled peptides, for instance, by intraperitoneal
and/or intradermal injection of emulsions containing about 100 µg of peptide or carrier protein and
Freund's adjuvant. Several booster injections may be needed, for instance, at intervals of about two
weeks, to provide a useful titer of anti-peptide antibody, which can be detected, for example, by
ELISA assay using free peptide adsorbed to a solid surface. The titer of anti-peptide antibodies in
serum from an immunized animal may be increased by selection of anti-peptide antibodies, for
instance, by adsorption to the peptide on a solid support and elution of the selected antibodies
according to methods well known in the art.

As one of skill in the art will appreciate, and as discussed above, the polypeptides of the
present invention comprising an immunogenic or antigenic epitope can be fused to heterologous
polypeptide sequences. For example, the polypeptides of the present invention may be fused with
the constant domain of immunoglobulins (IgA, IgE, IgG, IgM), or portions thereof (CH1, CH2,
CH3, any combination thereof including both entire domains and portions thereof) resulting in
chimeric polypeptides. These fusion proteins facilitate purification, and show an increased half-life
in vivo. This has been shown, e.g., for chimeric proteins consisting of the first two domains of the
human CD4-polypeptide and various domains of the constant regions of the heavy or light chains of
mammalian immunoglobulins (See, e.g., EPA 0,394,827; and Traunecker et al., 1988), which
disclosures are hereby incorporated by reference in their entireties. Fusion proteins that have a
disulfide-linked dimeric structure due to the IgG portion can also be more efficient in binding and
neutralizing other molecules than monomeric polypeptides or fragments thereof alone (See, e.g.,
Fountoulakis et al., 1995), which disclosure is hereby incorporated by reference in its entirety.
Nucleic acids encoding the above epitopes can also be recombined with a gene of interest as an
epitope tag to aid in detection and purification of the expressed polypeptide.

Additional fusion proteins of the invention may be generated through the techniques of
gene-shuffling, motif-shuffling, exon-shuffling, or codon-shuffling (collectively referred to as
"DNA shuffling"). DNA shuffling may be employed to modulate the activities of polypeptides of
the present invention, thereby effectively generating agonists and antagonists of the polypeptides.
See, for example, U.S. Patent Nos.: 5,605,793; 5,811,238; 5,834,252; 5,837,458; and Patten et al.,
(1997); Harayama, (1998); Hansson et al., (1999); and Lorenzo and Blasco, (1998). (Each of these
documents is hereby incorporated by reference). In one embodiment, one or more components,
motifs, sections, parts, domains, fragments, etc., of coding polynucleotides of the invention, or the
polypeptides encoded thereby, may be recombined with one or more components, motifs, sections,
parts, domains, fragments, etc. of one or more heterologous molecules.

The present invention further encompasses any combination of the polypeptide fragments 
listed in this section. 

Antibodies: 
Definitions 

The present invention further relates to antibodies and T-cell antigen receptors (TCR),
which specifically bind the polypeptides, and more specifically, the epitopes of the polypeptides of
the present invention. The antibodies of the present invention include IgG (including IgG1, IgG2,
IgG3, and IgG4), IgA (including IgA1 and IgA2), IgD, IgE, IgM, and IgY. The term "antibody"
(Ab) refers to a polypeptide or group of polypeptides which are comprised of at least one binding
domain, where a binding domain is formed from the folding of variable domains of an antibody
molecule to form three-dimensional binding spaces with an internal surface shape and charge
distribution complementary to the features of an antigenic determinant of an antigen, which allows
an immunological reaction with the antigen. As used herein, the term "antibody" is meant to
include whole antibodies, including single-chain whole antibodies, and antigen binding fragments
thereof. In a preferred embodiment, the antibodies are human antigen-binding antibody fragments of
the present invention and include, but are not limited to, Fab, Fab', F(ab)2 and F(ab')2, Fd, single-chain
Fvs (scFv), single-chain antibodies, disulfide-linked Fvs (sdFv) and fragments comprising either a
VL or VH domain. The antibodies may be of any animal origin including birds and mammals.
Preferably, the antibodies are human, murine, rabbit, goat, guinea pig, camel, horse, or chicken.

Antigen-binding antibody fragments, including single-chain antibodies, may comprise the
variable region(s) alone or in combination with the entirety or a portion of the following: hinge
region, CH1, CH2, and CH3 domains. Also included in the invention are any combinations of variable
region(s) and hinge region, CH1, CH2, and CH3 domains. The present invention further includes
chimeric, humanized, and human monoclonal and polyclonal antibodies, which specifically bind the
polypeptides of the present invention. The present invention further includes antibodies that are
anti-idiotypic to the antibodies of the present invention.

The antibodies of the present invention may be monospecific, bispecific, trispecific, or of
greater multispecificity. Multispecific antibodies may be specific for different epitopes of a
polypeptide of the present invention or may be specific for both a polypeptide of the present
invention as well as for heterologous compositions, such as a heterologous polypeptide or solid
support material. See, e.g., WO 93/17715; WO 92/08802; WO 91/00360; WO 92/05793; Tutt et al.
(1991); US Patents 5,573,920, 4,474,893, 5,601,819, 4,714,681, 4,925,648; Kostelny et al. (1992),
which disclosures are hereby incorporated by reference in their entireties.

Antibodies of the present invention may be described or specified in terms of the epitope(s)
or epitope-bearing portion(s) of a polypeptide of the present invention, which are recognized or
specifically bound by the antibody. The antibodies may specifically bind a complete protein encoded
by a nucleic acid of the present invention, or a fragment thereof, particularly, in the case of secreted
proteins, the mature protein or the signal peptide. Therefore, the epitope(s) or epitope-bearing
polypeptide portion(s) may be specified as described herein, e.g., by N-terminal and C-terminal
positions, by size in contiguous amino acid residues, or otherwise described herein (including the
sequence listing). Antibodies which specifically bind any epitope or polypeptide of the present
invention may also be excluded as individual species. Therefore, the present invention includes
antibodies that specifically bind specified polypeptides of the present invention, and allows for the
exclusion of the same.

Thus, another embodiment of the present invention is a purified or isolated antibody
capable of specifically binding to a polypeptide comprising a sequence selected from the group
consisting of the sequences of SEQ ID Nos: 242-482 and the sequences of the clone inserts of the
deposited clone pool. In one aspect of this embodiment, the antibody is capable of binding to an
epitope-containing polypeptide comprising at least 6 consecutive amino acids, preferably at least 8
to 10 consecutive amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100
consecutive amino acids of a sequence selected from the group consisting of SEQ ID Nos: 242-482
and sequences of the clone inserts of the deposited clone pool.

Antibodies of the present invention may also be described or specified in terms of their
cross-reactivity. Antibodies that do not specifically bind any other analog, ortholog, or homologue
of the polypeptides of the present invention are included. Antibodies that do not bind polypeptides
with less than 95%, less than 90%, less than 85%, less than 80%, less than 75%, less than 70%, less
than 65%, less than 60%, less than 55%, and less than 50% identity (as calculated using methods
known in the art and described herein, e.g., using FASTDB and the parameters set forth herein) to a
polypeptide of the present invention are also included in the present invention. Further included in
the present invention are antibodies which only bind polypeptides encoded by polynucleotides
which hybridize to a polynucleotide of the present invention under stringent hybridization
conditions (as described herein). Antibodies of the present invention may also be described or
specified in terms of their binding affinity. Preferred binding affinities include those with a
dissociation constant or Kd less than 5x10^-6 M, 10^-6 M, 5x10^-7 M, 10^-7 M, 5x10^-8 M, 10^-8 M,
5x10^-9 M, 10^-9 M, 5x10^-10 M, 10^-10 M, 5x10^-11 M, 10^-11 M, 5x10^-12 M, 10^-12 M, 5x10^-13 M,
10^-13 M, 5x10^-14 M, 10^-14 M, 5x10^-15 M, and 10^-15 M.

The invention also concerns a purified or isolated antibody capable of specifically binding
to a mutated GENSET protein or to a fragment or variant thereof comprising an epitope of the
mutated GENSET protein.

Preparation of antibodies 

The antibodies of the present invention may be prepared by any suitable method known in
the art. Some of these methods are described in more detail in the example entitled "Preparation of
Antibody Compositions to ...". For example, a polypeptide of the present invention or an antigenic
fragment thereof can be administered to an animal in order to induce the production of sera
containing "polyclonal antibodies". As used herein, the term "monoclonal antibody" is not limited
to antibodies produced through hybridoma technology; rather, it refers to an antibody that is
derived from a single clone, including a eukaryotic, prokaryotic, or phage clone, and not to the method
by which it is produced. Monoclonal antibodies can be prepared using a wide variety of techniques
known in the art including the use of hybridoma, recombinant, and phage display technology.

Hybridoma techniques include those known in the art (See, e.g., Harlow et al., 1988;
Hammerling et al., 1981). (Said references incorporated by reference in their entireties). Fab and
F(ab')2 fragments may be produced, for example, from hybridoma-produced antibodies by
proteolytic cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin (to
produce F(ab')2 fragments).

Alternatively, antibodies of the present invention can be produced through the application
of recombinant DNA technology or through synthetic chemistry using methods known in the art.
For example, the antibodies of the present invention can be prepared using various phage display
methods known in the art. In phage display methods, functional antibody domains are displayed on
the surface of a phage particle, which carries polynucleotide sequences encoding them. Phage with
a desired binding property are selected from a repertoire or combinatorial antibody library (e.g.,
human or murine) by selecting directly with antigen, typically antigen bound or captured to a solid
surface or bead. Phage used in these methods are typically filamentous phage including fd and M13
with Fab, Fv or disulfide stabilized Fv antibody domains recombinantly fused to either the phage
gene III or gene VIII protein. Examples of phage display methods that can be used to make the
antibodies of the present invention include those disclosed in Brinkman et al. (1995); Ames et al.
(1995); Kettleborough et al. (1994); Persic et al. (1997); Burton et al. (1994); PCT/GB91/01134;
WO 90/02809; WO 91/10737; WO 92/01047; WO 92/18619; WO 93/11236; WO 95/15982; WO
95/20401; and US Patents 5,698,426, 5,223,409, 5,403,484, 5,580,717, 5,427,908, 5,750,753,
5,821,047, 5,571,698, 5,427,908, 5,516,637, 5,780,225, 5,658,727 and 5,733,743 (said references
incorporated by reference in their entireties).

As described in the above references, after phage selection, the antibody coding regions
from the phage can be isolated and used to generate whole antibodies, including human antibodies,
or any other desired antigen binding fragment, and expressed in any desired host including
mammalian cells, insect cells, plant cells, yeast, and bacteria. For example, techniques to
recombinantly produce Fab, Fab', F(ab)2 and F(ab')2 fragments can also be employed using
methods known in the art such as those disclosed in WO 92/22324; Mullinax et al. (1992);
Sawai et al. (1995); and Better et al. (1988) (said references incorporated by reference in their
entireties).

Examples of techniques which can be used to produce single-chain Fvs and antibodies
include those described in U.S. Patents 4,946,778 and 5,258,498; Huston et al. (1991); Shu et al.
(1993); and Skerra et al. (1988), which disclosures are hereby incorporated by reference in their
entireties. For some uses, including in vivo use of antibodies in humans and in vitro detection
assays, it may be preferable to use chimeric, humanized, or human antibodies. Methods for
producing chimeric antibodies are known in the art. See, e.g., Morrison, (1985); Oi et al., (1986);
Gillies et al. (1989); and US Patent 5,807,715, which disclosures are hereby incorporated by
reference in their entireties. Antibodies can be humanized using a variety of techniques including
CDR-grafting (EP 0 239 400; WO 91/09967; US Patents 5,530,101 and 5,585,089), veneering or
resurfacing (EP 0 592 106; EP 0 519 596; Padlan, 1991; Studnicka et al., 1994; Roguska et al.,
1994), and chain shuffling (US Patent 5,565,332), which disclosures are hereby incorporated by
reference in their entireties. Human antibodies can be made by a variety of methods known in the
art including phage display methods described above. See also, US Patents 4,444,887, 4,716,111,
5,545,806, and 5,814,318; WO 98/46645; WO 98/50433; WO 98/24893; WO 96/34096; WO
96/33735; and WO 91/10741 (said references incorporated by reference in their entireties).

Further included in the present invention are antibodies recombinantly fused or chemically
conjugated (including both covalent and non-covalent conjugations) to a polypeptide of the
present invention. The antibodies may be specific for antigens other than polypeptides of the
present invention. For example, antibodies of the present invention may be recombinantly fused or
conjugated to molecules useful as labels in detection assays and effector molecules such as
heterologous polypeptides, drugs, or toxins. See, e.g., WO 92/08495; WO 91/14438; WO 89/12624;
US Patent 5,314,995; and EP 0 396 387, which disclosures are hereby incorporated by reference in
their entireties. Fused antibodies may also be used to target the polypeptides of the present
invention to particular cell types, either in vitro or in vivo, by fusing or conjugating the polypeptides
of the present invention to antibodies specific for particular cell surface receptors. Antibodies fused
or conjugated to the polypeptides of the present invention may also be used in in vitro immunoassays
and purification methods using methods known in the art (See, e.g., Harlow et al., supra; WO
93/21232; EP 0 439 095; Naramura, M. et al., 1994; US Patent 5,474,981; Gillies et al., 1992; Fell et
al., 1991) (said references incorporated by reference in their entireties).

The present invention further includes compositions comprising the polypeptides of the
present invention fused or conjugated to antibody domains other than the variable regions. For
example, the polypeptides of the present invention may be fused or conjugated to an antibody Fc
region, or portion thereof. The antibody portion fused to a polypeptide of the present invention may
comprise the hinge region, CH1 domain, CH2 domain, and CH3 domain or any combination of
whole domains or portions thereof. The polypeptides of the present invention may be fused or
conjugated to the above antibody portions to increase the in vivo half-life of the polypeptides or for
use in immunoassays using methods known in the art. The polypeptides may also be fused or
conjugated to the above antibody portions to form multimers. For example, Fc portions fused to the
polypeptides of the present invention can form dimers through disulfide bonding between the Fc
portions. Higher multimeric forms can be made by fusing the polypeptides to portions of IgA and
IgM. Methods for fusing or conjugating the polypeptides of the present invention to antibody
portions are known in the art. See, e.g., US Patents 5,336,603, 5,622,929, 5,359,046, 5,349,053,
5,447,851, 5,112,946; EP 0 307 434, EP 0 367 166; WO 96/04388, WO 91/06570; Ashkenazi et al.
(1991); Zheng et al. (1995); and Vil et al. (1992) (said references incorporated by reference in their
entireties).

Non-human animals or mammals, whether wild-type or transgenic, which express a
different species of GENSET than the one to which antibody binding is desired, and animals which
do not express GENSET (i.e., a GENSET knock out animal as described herein) are particularly
useful for preparing antibodies. GENSET knock out animals will recognize all or most of the
exposed regions of a GENSET protein as foreign antigens, and therefore produce antibodies to a
wider array of GENSET epitopes. Moreover, smaller polypeptides with only 10 to 30 amino acids
may be useful in obtaining specific binding to any one of the GENSET proteins. In addition, the
humoral immune system of animals which produce a species of GENSET that resembles the
antigenic sequence will preferentially recognize the differences between the animal's native
GENSET species and the antigen sequence, and produce antibodies to these unique sites in the
antigen sequence. Such a technique will be particularly useful in obtaining antibodies that
specifically bind to any one of the GENSET proteins.

The antibodies of the invention may be labeled by any one of the radioactive, fluorescent or
enzymatic labels known in the art.

Uses of polynucleotides

Uses of polynucleotides as reagents 

The polynucleotides of the present invention, particularly those described in the
"Oligonucleotide primers and probes" section, may be used as reagents in isolation procedures,
diagnostic assays, and forensic procedures. For example, sequences from the GENSET
polynucleotides of the invention may be detectably labeled and used as probes to isolate other
sequences capable of hybridizing to them. In addition, sequences from the GENSET
polynucleotides of the invention may be used to design PCR primers to be used in isolation,
diagnostic, or forensic procedures.

In forensic analyses 

PCR primers may be used in forensic analyses, such as the DNA fingerprinting techniques
described below. Such analyses may utilize detectable probes or primers based on the sequences of
the polynucleotides of the invention. Consequently, the present invention encompasses methods of
identification of an individual using the polynucleotides of the invention in forensic analyses,
wherein said method includes the steps of:

a) obtaining a biological sample containing nucleic acid material from an individual; 

b) obtaining an identification pattern for this individual using the polynucleotides of the 
invention, particularly using GENSET primers and probes; 

c) comparing said identification pattern with a reference identification pattern; and

d) determining whether said identification pattern is identical to said reference identification
pattern.

In one embodiment of this method, the identification pattern consists of sequences of
amplicons obtained using GENSET primers as explained in the sections entitled "Forensic
Matching by DNA Sequencing" and "Positive Identification by DNA Sequencing".

In another embodiment, the identification pattern consists of unique band or dot patterns
obtained using any method described in the sections entitled "Southern Blot Forensic
Identification", "Dot Blot Identification Procedure" and "Alternative "Fingerprint" Identification
Technique".

Forensic Matching by DNA Sequencing

In one exemplary method, DNA samples are isolated from forensic specimens of, for
example, hair, semen, blood or skin cells by conventional methods. A panel of PCR primers
designed from different polynucleotides of the invention, using any technique known to those skilled
in the art including those described herein, is then utilized to amplify DNA of approximately
100-200 bases in length from the forensic specimen. Corresponding sequences are obtained from a test
subject. Each of these identification DNAs is then sequenced using standard techniques, and a
simple database comparison determines the differences, if any, between the sequences from the
subject and those from the sample. Statistically significant differences between the suspect's DNA
sequences and those from the sample conclusively prove a lack of identity. This lack of identity can
be proven, for example, with only one sequence. Identity, on the other hand, should be
demonstrated with a large number of sequences, all matching. Preferably, a minimum of 50
statistically identical sequences of 100 bases in length are used to prove identity between the
suspect and the sample.
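
The decision logic of this comparison can be sketched in Python as follows. This is an illustrative simplification, not the method itself: exact string comparison stands in for the statistical comparison of sequences, a single differing amplicon is treated as exclusion, and the function name, locus labels and return values are hypothetical. The default of 50 matching sequences follows the preferred minimum stated above.

```python
from typing import Dict


def compare_panels(
    specimen: Dict[str, str],
    subject: Dict[str, str],
    min_loci: int = 50,
) -> str:
    """Compare amplicon sequences keyed by a (hypothetical) locus label.

    Returns 'excluded' if any shared locus differs, 'match' if at least
    min_loci shared loci are all identical, and 'inconclusive' otherwise.
    Assumption: exact, case-insensitive string equality stands in for a
    statistical test of sequence identity.
    """
    shared = specimen.keys() & subject.keys()
    if any(specimen[locus].upper() != subject[locus].upper() for locus in shared):
        return "excluded"
    if len(shared) >= min_loci:
        return "match"
    return "inconclusive"


if __name__ == "__main__":
    specimen = {"locus1": "ACGTACGT", "locus2": "GGCCTTAA"}
    subject = {"locus1": "ACGTACGT", "locus2": "GGCCTTAG"}
    print(compare_panels(specimen, subject))
```

The asymmetry in the sketch mirrors the text: one differing sequence is enough to exclude, whereas a finding of identity requires many sequences, all matching.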

Positive Identification by DNA Sequencing 

The "Forensic Matching by DNA Sequencing" technique described herein may also be used
on a larger scale to provide a unique fingerprint-type identification of any individual. In this
technique, primers are prepared from a large number of polynucleotides of the invention.
Preferably, 20 to 50 different primers are used. These primers are used to obtain a corresponding
number of PCR-generated DNA segments from the individual in question. Each of these DNA
segments is sequenced. The database of sequences generated through this procedure uniquely
identifies the individual from whom the sequences were obtained. The same panel of primers may
then be used at any later time to absolutely correlate tissue or other biological specimen with that
individual.

Southern Blot Forensic Identification

The "Positive Identification by DNA Sequencing" procedure described herein is repeated to
obtain a panel of at least 10 amplified sequences from an individual and a specimen. Preferably, the
panel contains at least 50 amplified sequences. More preferably, the panel contains 100 amplified
sequences. In some embodiments, the panel contains 200 amplified sequences. This PCR-generated
DNA is then digested with one or a combination of, preferably, four-base-specific
restriction enzymes. Such enzymes are commercially available and known to those of skill in the
art. After digestion, the resultant gene fragments are size separated in multiple duplicate wells on
an agarose gel and transferred to nitrocellulose using Southern blotting techniques well known to
those with skill in the art. For a review of Southern blotting see Davis et al. (1986), which
disclosure is hereby incorporated by reference in its entirety.

A panel of probes based on the sequences of the polynucleotides of the invention, or
fragments thereof of at least 10 bases, is radioactively or colorimetrically labeled using methods
known in the art, such as nick translation or end labeling, and hybridized to the Southern blot using
techniques known in the art. Preferably, the probe comprises at least 12, 15, or 17 consecutive
nucleotides from the polynucleotide of the invention. More preferably, the probe comprises at least
20-30 consecutive nucleotides from the polynucleotide of the invention. In some embodiments, the
probe comprises more than 30 nucleotides from the polynucleotide of the invention. In other
embodiments, the probe comprises at least 40, at least 50, at least 75, at least 100, at least 150, or at
least 200 consecutive nucleotides from the polynucleotide of the invention.

Preferably, at least 5 to 10 of these labeled probes are used, and more preferably at least
about 20 or 30 are used to provide a unique pattern. The resultant bands appearing from the
hybridization of a large sample of polynucleotides of the invention will be a unique identifier. Since
the restriction enzyme cleavage will be different for every individual, the band pattern on the
Southern blot will also be unique. Increasing the number of cDNA probes will provide a
statistically higher level of confidence in the identification since there will be an increased number
of sets of bands used for identification.
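
To illustrate why a sequence difference at a restriction site changes the band pattern, the following Python sketch performs an in silico digestion of a linear sequence and returns the resulting fragment lengths. It is a simplified sketch: the four-base cutter HaeIII (recognition site GGCC, blunt cut between GG and CC) is used purely as an example and is not named in the text above, the function name and test sequences are hypothetical, and only one strand of a linear molecule is considered.

```python
from typing import List


def digest(seq: str, site: str, cut_offset: int) -> List[int]:
    """Return fragment lengths (longest first) after cutting a linear sequence.

    site is the recognition sequence and cut_offset is the number of bases
    from the start of the site to the cut position (2 for HaeIII, GG^CC).
    """
    seq = seq.upper()
    cuts = []
    index = seq.find(site)
    while index != -1:
        cuts.append(index + cut_offset)
        index = seq.find(site, index + 1)
    bounds = [0] + cuts + [len(seq)]
    lengths = [end - start for start, end in zip(bounds, bounds[1:]) if end > start]
    return sorted(lengths, reverse=True)


if __name__ == "__main__":
    individual_a = "AAGGCCTTTTGGCCA"   # two GGCC sites
    individual_b = "AAGGCCTTTTGGACA"   # second site lost through a single base change
    print(digest(individual_a, "GGCC", 2))
    print(digest(individual_b, "GGCC", 2))
```

The two hypothetical sequences differ by a single base, yet they give different sets of fragment lengths, which is the basis for the distinct band patterns described above.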

Dot Blot Identification Procedure 

Another technique for identifying individuals using the polynucleotide sequences disclosed 
herein utilizes a dot blot hybridization technique. 

Genomic DNA is isolated from nuclei of the subject to be identified. Oligonucleotide probes of
approximately 30 bp in length are synthesized that correspond to at least 10, preferably 50,
sequences from the polynucleotide of the invention. The probes are used to hybridize to the
genomic DNA under conditions known to those in the art. The oligonucleotides are end labeled
with 32P using polynucleotide kinase (Pharmacia). Dot blots are created by spotting the genomic
DNA onto nitrocellulose or the like using a vacuum dot blot manifold (BioRad, Richmond,
California). The genomic DNA on the nitrocellulose filter is baked or UV cross-linked to the
filter, prehybridized and hybridized with labeled probe using techniques known in the art (Davis et
al., 1986). The 32P-labeled DNA fragments are sequentially hybridized under successively more
stringent conditions to detect minimal differences between the 30 bp sequence and the DNA.
Tetramethylammonium chloride is useful for identifying clones containing small numbers of
nucleotide mismatches (Wood et al., 1985). A unique pattern of dots distinguishes one individual
from another individual.

Alternative "Fingerprint" Identification Technique 

In a representative alternative fingerprinting procedure, the probes are derived from cDNAs. 
Preferably, a plurality of probes having sequences from different genes are used as follows. 

Polynucleotides containing at least 10 consecutive bases from these sequences can be used as
probes. Preferably, the probe comprises at least 12, 15, or 17 consecutive nucleotides from the
polynucleotide of the invention. More preferably, the probe comprises at least 20-30 consecutive
nucleotides from the polynucleotide of the invention. In some embodiments, the probe comprises
more than 30 nucleotides from the polynucleotide of the invention. In other embodiments, the
probe comprises at least 40, at least 50, at least 75, at least 100, at least 150, or at least 200
consecutive nucleotides from the polynucleotide of the invention.

Oligonucleotides, generally 20-mers, are prepared from a large number, e.g. 50, 100, or
200, of polynucleotides of the invention using commercially available oligonucleotide services such
as Genset, Paris, France. Cell samples from the test subject are processed for DNA using
techniques well known to those with skill in the art. The nucleic acid is digested with restriction
enzymes such as EcoRI and XbaI. Following digestion, samples are applied to wells for
electrophoresis. The procedure, as known in the art, may be modified to accommodate
polyacrylamide electrophoresis; however, in this example, samples containing 5 µg of DNA are
loaded into wells and separated on 0.8% agarose gels. The separated DNA is transferred from the
gels onto nitrocellulose using standard Southern blotting techniques.
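
The digestion step above can be sketched in code to show why the band pattern differs between
individuals. The following is a minimal illustration only: the two short allele sequences are
hypothetical, the function name digest is an assumption made for this sketch, and only the
well-known EcoRI (G^AATTC) and XbaI (T^CTAGA) recognition sites are modeled on a linear top strand.

```python
# Recognition site and cut offset within the site (top strand) for each enzyme.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # cuts G^AATTC
    "XbaI": ("TCTAGA", 1),   # cuts T^CTAGA
}


def digest(sequence, enzyme_names):
    """Return fragment lengths of a linear sequence after complete digestion."""
    seq = sequence.upper()
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical alleles: allele_b carries one base change that destroys the EcoRI site.
allele_a = "AAAA" + "GAATTC" + "CCCCC" + "TCTAGA" + "GGGG"
allele_b = "AAAA" + "GAGTTC" + "CCCCC" + "TCTAGA" + "GGGG"

fragments_a = digest(allele_a, ["EcoRI", "XbaI"])
fragments_b = digest(allele_b, ["EcoRI", "XbaI"])
print(fragments_a, fragments_b)
```

A single base difference within a recognition site changes the number and size of the fragments,
which is the property the Southern blot pattern relies upon.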

10 ng of each of the oligonucleotides are pooled and end-labeled with 32P. The
nitrocellulose is prehybridized with blocking solution and hybridized with the labeled probes. 
Following hybridization and washing, the nitrocellulose filter is exposed to X-Omat AR X-ray film. 
The resulting hybridization pattern will be unique for each individual. 

It is additionally contemplated within this example that the number of probe sequences used
can be varied for additional accuracy or clarity.

To find corresponding genomic DNA sequences 

The GENSET cDNAs of the invention may also be used to clone sequences located
upstream of the cDNAs of the invention on the corresponding genomic DNA. Such upstream
sequences may be capable of regulating gene expression, including promoter sequences, enhancer
sequences, and other upstream sequences which influence transcription or translation levels. Once
identified and cloned, these upstream regulatory sequences may be used in expression vectors
designed to direct the expression of an inserted gene in a desired spatial, temporal, developmental,
or quantitative fashion.

Use of cDNAs or Fragments thereof to Clone Upstream Sequences from Genomic DNA 
Sequences derived from polynucleotides of the invention may be used to isolate the
promoters of the corresponding genes using chromosome walking techniques. In one chromosome
walking technique, which utilizes the GenomeWalker™ kit available from Clontech, five complete
genomic DNA samples are each digested with a different restriction enzyme which has a 6 base
recognition site and leaves a blunt end. Following digestion, oligonucleotide adapters are ligated to
each end of the resulting genomic DNA fragments.

For each of the five genomic DNA libraries, a first PCR reaction is performed according to
the manufacturer's instructions (which are incorporated herein by reference) using an outer adaptor
primer provided in the kit and an outer gene specific primer. The gene specific primer should be
selected to be specific for the polynucleotide of the invention of interest and should have a melting
temperature, length, and location in the polynucleotide of the invention which is consistent with its
use in PCR reactions. Each first PCR reaction contains 5 ng of genomic DNA, 5 µl of 10X Tth
reaction buffer, 0.2 mM of each dNTP, 0.2 µM each of outer adaptor primer and outer gene specific
primer, 1.1 mM of Mg(OAc)2, and 1 µl of the Tth polymerase 50X mix in a total volume of 50 µl.
The reaction cycle for the first PCR reaction is as follows: 1 min at 94°C / 2 sec at 94°C, 3 min at
72°C (7 cycles) / 2 sec at 94°C, 3 min at 67°C (32 cycles) / 5 min at 67°C.

The product of the first PCR reaction is diluted and used as a template for a second PCR
reaction according to the manufacturer's instructions using a pair of nested primers which are
located internally on the amplicon resulting from the first PCR reaction. For example, 5 µl of the
reaction product of the first PCR reaction mixture may be diluted 180 times. Reactions are made in
a 50 µl volume having a composition identical to that of the first PCR reaction except the nested
primers are used. The first nested primer is specific for the adaptor, and is provided with the
GenomeWalker™ kit. The second nested primer is specific for the particular polynucleotide of the
invention for which the promoter is to be cloned and should have a melting temperature, length, and
location in the polynucleotide of the invention which is consistent with its use in PCR reactions.
The reaction parameters of the second PCR reaction are as follows: 1 min at 94°C / 2 sec at 94°C,
3 min at 72°C (6 cycles) / 2 sec at 94°C, 3 min at 67°C (25 cycles) / 5 min at 67°C.
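
For bookkeeping purposes, the two cycling programs described above may be written down as data
and their programmed hold times summed. The sketch below is illustrative only: the names
FIRST_PCR, SECOND_PCR and hold_seconds are assumptions made here, and the totals exclude ramp
times between temperatures, which depend on the thermal cycler used.

```python
# Each stage is (list of (temperature in deg C, hold in seconds), number of cycles).
FIRST_PCR = [
    ([(94, 60)], 1),             # 1 min at 94
    ([(94, 2), (72, 180)], 7),   # 2 sec at 94, 3 min at 72, 7 cycles
    ([(94, 2), (67, 180)], 32),  # 2 sec at 94, 3 min at 67, 32 cycles
    ([(67, 300)], 1),            # 5 min at 67
]

SECOND_PCR = [
    ([(94, 60)], 1),
    ([(94, 2), (72, 180)], 6),   # 6 cycles in the nested reaction
    ([(94, 2), (67, 180)], 25),  # 25 cycles in the nested reaction
    ([(67, 300)], 1),
]


def hold_seconds(program):
    """Sum the programmed hold times of a cycling program, ignoring ramp times."""
    total = 0
    for steps, cycles in program:
        total += cycles * sum(seconds for _temperature, seconds in steps)
    return total


first_total = hold_seconds(FIRST_PCR)
second_total = hold_seconds(SECOND_PCR)
print(first_total / 60.0, second_total / 60.0)  # minutes of programmed hold time
```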

The product of the second PCR reaction is purified, cloned, and sequenced using standard
techniques. Alternatively, two or more human genomic DNA libraries can be constructed by using
two or more restriction enzymes. The digested genomic DNA is cloned into vectors which can be
converted into single stranded, circular, or linear DNA. A biotinylated oligonucleotide comprising
at least 15 nucleotides from the polynucleotide of the invention sequence is hybridized to the single
stranded DNA. Hybrids between the biotinylated oligonucleotide and the single stranded DNA
containing the polynucleotide of the invention sequence are isolated as described herein.
Thereafter, the single stranded DNA containing the polynucleotide of the invention sequence is
released from the beads and converted into double stranded DNA using a primer specific for the
polynucleotide of the invention sequence or a primer corresponding to a sequence included in the
cloning vector. The resulting double stranded DNA is transformed into bacteria. DNAs containing
the GENSET polynucleotide sequences are identified by colony PCR or colony hybridization.

Identification of Promoters in Cloned Upstream Sequences

Once the upstream genomic sequences have been cloned and sequenced as described above, 
prospective promoters and transcription start sites within the upstream sequences may be identified 
by comparing the sequences upstream of the polynucleotides of the invention with databases
containing known transcription start sites, transcription factor binding sites, or promoter sequences. 

In addition, promoters in the upstream sequences may be identified using promoter reporter
vectors as follows. The expression of the reporter gene will be detected when placed under the
control of regulatory active polynucleotide fragments or variants of the GENSET promoter region
located upstream of the first exon of the GENSET gene. Suitable promoter reporter vectors, into
which the GENSET promoter sequences may be cloned, include pSEAP-Basic, pSEAP-Enhancer,
pβgal-Basic, pβgal-Enhancer, or pEGFP-1 Promoter Reporter vectors available from Clontech, or
pGL2-basic or pGL3-basic promoterless luciferase reporter gene vectors from Promega. Briefly,
each of these promoter reporter vectors includes multiple cloning sites positioned upstream of a
reporter gene encoding a readily assayable protein such as secreted alkaline phosphatase, luciferase,
beta-galactosidase, or green fluorescent protein. The sequences upstream of the GENSET coding
region are inserted into the cloning sites upstream of the reporter gene in both orientations and
introduced into an appropriate host cell. The level of reporter protein is assayed and compared to
the level obtained from a vector which lacks an insert in the cloning site. The presence of an
elevated expression level in the vector containing the insert with respect to the control vector
indicates the presence of a promoter in the insert. If necessary, the upstream sequences can be
cloned into vectors which contain an enhancer for increasing transcription levels from weak
promoter sequences. A significant level of expression above that observed with the vector lacking
an insert indicates that a promoter sequence is present in the inserted upstream sequence.
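
The comparison between the insert-containing vector and the insert-free control may be expressed as
a fold-over-control ratio. The sketch below is an illustration under stated assumptions: the
function names and the default two-fold cutoff (min_fold) are hypothetical choices made here, since
the text does not specify what level of elevation is to be regarded as significant.

```python
def fold_over_control(insert_signal, control_signal):
    """Ratio of reporter signal with insert to signal from the vector lacking an insert."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return insert_signal / control_signal


def has_promoter_activity(insert_signal, control_signal, min_fold=2.0):
    """Flag an insert as promoter-positive; min_fold is a hypothetical cutoff."""
    return fold_over_control(insert_signal, control_signal) >= min_fold


# Hypothetical reporter readings in arbitrary units, one per insert orientation.
readings = {"forward": 500.0, "reverse": 120.0}
control = 100.0
calls = {name: has_promoter_activity(value, control) for name, value in readings.items()}
print(calls)
```

In practice the cutoff would be chosen with reference to replicate measurements and the
variability of the assay.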

Promoter sequences within the upstream genomic DNA may be further defined by site
directed mutagenesis, linker scanning analysis, or other techniques familiar to those skilled in the
art. For example, the boundaries of promoters may be further investigated by constructing nested 5'
and/or 3' deletions in the upstream DNA using conventional techniques such as Exonuclease III or
appropriate restriction endonuclease digestion. The resulting deletion fragments can be inserted
into the promoter reporter vector to determine whether the deletion has increased, reduced or
eliminated promoter activity, such as described, for example, by Coles et al. (1998), the disclosure
of which is incorporated herein by reference in its entirety. In this way, the boundaries of the
promoters may be defined. If desired, potential individual regulatory sites within the promoter may
be identified using site directed mutagenesis or linker scanning to obliterate potential transcription
factor binding sites within the promoter individually or in combination. The effects of these
mutations on transcription levels may be determined by inserting the mutations into cloning sites in
promoter reporter vectors. This type of assay is well known to those skilled in the art and is
described in WO 97/17359; US Patent No. 5,374,544; EP 582 796; US Patent No. 5,698,389; US
Patent No. 5,643,746; US Patent No. 5,502,176; and US Patent No. 5,266,488; the disclosures of
which are incorporated by reference herein in their entirety.

The strength and the specificity of the promoter of each GENSET gene can be assessed
through the expression levels of a detectable polynucleotide operably linked to the GENSET
promoter in different types of cells and tissues. The detectable polynucleotide may be either a
polynucleotide that specifically hybridizes with a predefined oligonucleotide probe, or a
polynucleotide encoding a detectable protein, including a GENSET polypeptide or a fragment or a
variant thereof. This type of assay is well known to those skilled in the art and is described in US
Patent No. 5,502,176; and US Patent No. 5,266,488; the disclosures of which are incorporated by
reference herein in their entirety. Some of the methods are discussed in more detail elsewhere in
the application.

The promoters and other regulatory sequences located upstream of the polynucleotides of
the invention may be used to design expression vectors capable of directing the expression of an
inserted gene in a desired spatial, temporal, developmental, or quantitative manner. A promoter
capable of directing the desired spatial, temporal, developmental, and quantitative patterns may be
selected using the results of the expression analysis described herein. For example, if a promoter
which confers a high level of expression in muscle is desired, the promoter sequence upstream of a
polynucleotide of the invention derived from an mRNA which is expressed at a high level in muscle
may be used in the expression vector. Such vectors are described in more detail elsewhere in the
application.

Preferably, the desired promoter is placed near multiple restriction sites to facilitate the
cloning of the desired insert downstream of the promoter, such that the promoter is able to drive
expression of the inserted gene. The promoter may be inserted in conventional nucleic acid
backbones designed for extrachromosomal replication, integration into the host chromosomes or
transient expression. Suitable backbones for the present expression vectors include retroviral
backbones, backbones from eukaryotic episomes such as SV40 or Bovine Papilloma Virus,
backbones from bacterial episomes, or artificial chromosomes.

Preferably, the expression vectors also include a polyA signal downstream of the multiple
restriction sites for directing the polyadenylation of mRNA transcribed from the gene inserted into
the expression vector.

To find similar sequences 

Polynucleotides of the invention may be used to isolate and/or purify nucleic acids similar
thereto using any methods well known to those skilled in the art including the techniques based on
hybridization or on amplification described in this section. These methods may be used to obtain the
genomic DNAs which encode the mRNAs from which the GENSET cDNAs are derived, mRNAs
corresponding to GENSET cDNAs, or nucleic acids which are homologous to GENSET cDNAs or
fragments thereof, such as variants, species homologues or orthologs. Thus, a plurality of cDNAs
similar to GENSET polynucleotides may be provided as cDNA libraries for subsequent evaluation
of the encoded proteins or use in diagnostic assays as described herein. cDNAs prepared by any
method described herein may be subsequently engineered to obtain nucleic acids which include
desired fragments of the cDNA using conventional techniques such as subcloning, PCR, or in vitro
oligonucleotide synthesis. For example, nucleic acids which include only the coding sequences
may be obtained using techniques known to those skilled in the art. Similarly, nucleic acids
containing any other desired fragment of the coding sequences for the encoded protein may be
obtained.

Indeed, cDNAs of the present invention or fragments thereof may be used to isolate nucleic
acids similar to cDNAs from a cDNA library or a genomic DNA library. Such cDNA libraries or
genomic DNA libraries may be obtained from a commercial source or made using techniques 
familiar to those skilled in the art such as those described in PCT publication WO 00/37491, which 
disclosure is hereby incorporated by reference in its entirety. Examples of methods for obtaining 
nucleic acids similar to GENSET polynucleotides are described below. 

Hybridization-based methods

Techniques for identifying cDNA clones in a cDNA library which hybridize to a given 
probe sequence are disclosed in Sambrook et al. (1989) and in Hames and Higgins (1985), the
disclosures of which are incorporated herein by reference in their entireties. The same techniques 
may be used to isolate genomic DNAs. 

Briefly, cDNA or genomic DNA clones which hybridize to the detectable probe are
identified and isolated for further manipulation as follows. Any polynucleotide fragment of the
invention may be used as a probe, in particular those defined in the "Oligonucleotide primers and
probes" section. A probe comprising at least 10 consecutive nucleotides from a GENSET cDNA or
fragment thereof is labeled with a detectable label such as a radioisotope or a fluorescent molecule.
Preferably, the probe comprises at least 12, 15, or 17 consecutive nucleotides from the cDNA or
fragment thereof. More preferably, the probe comprises 20 to 30 consecutive nucleotides from the
cDNA or fragment thereof. In some embodiments, the probe comprises more than 30 nucleotides
from the cDNA or fragment thereof.

Techniques for labeling the probe are well known and include phosphorylation with
polynucleotide kinase, nick translation, in vitro transcription, and non-radioactive techniques. The
cDNAs or genomic DNAs in the library are transferred to a nitrocellulose or nylon filter and
denatured. After blocking of non-specific sites, the filter is incubated with the labeled probe for an
amount of time sufficient to allow binding of the probe to cDNAs or genomic DNAs containing a
sequence capable of hybridizing thereto.

By varying the stringency of the hybridization conditions used to identify cDNAs or
genomic DNAs which hybridize to the detectable probe, cDNAs or genomic DNAs having different
levels of identity to the probe can be identified and isolated as described below.

Stringent conditions

"Stringent hybridization conditions" are defined as conditions in which only nucleic acids
having a high level of identity to the probe are able to hybridize to said probe. These conditions may
be calculated as follows:

For probes between 14 and 70 nucleotides in length the melting temperature (Tm) is
calculated using the formula: Tm = 81.5 + 16.6(log (Na+)) + 0.41(fraction G+C) - (600/N) where N is
the length of the probe.

If the hybridization is carried out in a solution containing formamide, the melting
temperature may be calculated using the equation: Tm = 81.5 + 16.6(log (Na+)) + 0.41(fraction G+C) -
(0.63% formamide) - (600/N) where N is the length of the probe.
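
The two equations above may be evaluated as in the following sketch. Several points are
assumptions made for this illustration and are not taken from the text: the function name
melting_temperature is hypothetical, the logarithm is taken as base 10, the sodium concentration
is expressed in mol/l, and the G+C term is entered as a percentage (0 to 100), which is the
convention under which the 0.41 coefficient is commonly applied, although the text writes
"fraction G+C". The formamide term is taken as 0.63 multiplied by the percent formamide.

```python
import math


def melting_temperature(probe, na_molar, formamide_percent=0.0):
    """Estimate Tm in deg C for a probe using the salt, G+C, formamide and length terms."""
    seq = probe.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("probe must not be empty")
    # Assumption: G+C content expressed as a percentage of the probe length.
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / n
    )


# Hypothetical 20-mer with 50% G+C content.
probe = "GC" * 5 + "AT" * 5
tm_aqueous = melting_temperature(probe, na_molar=1.0)
tm_formamide = melting_temperature(probe, na_molar=1.0, formamide_percent=50.0)
print(round(tm_aqueous, 1), round(tm_formamide, 1))
```

Under these assumptions, adding formamide lowers the calculated Tm, which is consistent with the
lower hybridization temperature used for formamide-containing solutions.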

Prehybridization may be carried out in 6X SSC, 5X Denhardt's reagent, 0.5% SDS, 100 µg
denatured fragmented salmon sperm DNA or 6X SSC, 5X Denhardt's reagent, 0.5% SDS, 100 µg
denatured fragmented salmon sperm DNA, 50% formamide. The formulas for SSC and Denhardt's
solutions are listed in Sambrook et al., 1986.

Hybridization is conducted by adding the detectable probe to the prehybridization solutions
listed above. Where the probe comprises double stranded DNA, it is denatured before addition to
the hybridization solution. The filter is contacted with the hybridization solution for a sufficient
period of time to allow the probe to hybridize to nucleic acids containing sequences complementary
thereto or homologous thereto. For probes over 200 nucleotides in length, the hybridization may be
carried out at 15-25°C below the Tm. For shorter probes, such as oligonucleotide probes, the
hybridization may be conducted at 15-25°C below the Tm. Preferably, for hybridizations in 6X
SSC, the hybridization is conducted at approximately 68°C. Preferably, for hybridizations in 50%
formamide containing solutions, the hybridization is conducted at approximately 42°C.

Following hybridization, the filter is washed in 2X SSC, 0.1% SDS at room temperature for
15 minutes. The filter is then washed with 0.1X SSC, 0.5% SDS at room temperature for 30
minutes to 1 hour. Thereafter, the filter is washed at the hybridization temperature in 0.1X SSC,
0.5% SDS. A final wash is conducted in 0.1X SSC at room temperature.

Nucleic acids which have hybridized to the probe are identified by autoradiography or other 
conventional techniques. 

Low and moderate conditions

Changes in the stringency of hybridization and signal detection are primarily accomplished
through the manipulation of formamide concentration (lower percentages of formamide result in
lowered stringency), salt conditions, or temperature. The above procedure may thus be modified to
identify nucleic acids having decreasing levels of identity to the probe sequence. For example, the
hybridization temperature may be decreased in increments of 5°C from 68°C to 42°C in a
hybridization buffer having a sodium concentration of approximately 1M. Following hybridization,
the filter may be washed with 2X SSC, 0.5% SDS at the temperature of hybridization. These
conditions are considered to be "moderate" conditions above 50°C and "low" conditions below
50°C. Alternatively, the hybridization may be carried out in buffers, such as 6X SSC, containing
formamide at a temperature of 42°C. In this case, the concentration of formamide in the
hybridization buffer may be reduced in 5% increments from 50% to 0% to identify clones having
decreasing levels of identity to the probe. Following hybridization, the filter may be washed with
6X SSC, 0.5% SDS at 50°C. These conditions are considered to be "moderate" conditions above
25% formamide and "low" conditions below 25% formamide. cDNAs or genomic DNAs which
have hybridized to the probe are identified by autoradiography or other conventional techniques.
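
The stepwise reductions and the "moderate" and "low" labels described above can be tabulated as
in the following sketch. The function names are hypothetical, and the handling of the boundary
values (exactly 50°C and exactly 25% formamide, which the text leaves unassigned) is an assumption
made here: both are treated as "low".

```python
def temperature_series(start=68, stop=42, step=5):
    """Hybridization temperatures in deg C, decreasing from start without passing stop."""
    return list(range(start, stop - 1, -step))


def formamide_series(start=50, stop=0, step=5):
    """Formamide percentages decreasing from start to stop for hybridization at 42 deg C."""
    return list(range(start, stop - 1, -step))


def classify_temperature(temp_c):
    """'moderate' above 50 deg C, otherwise 'low' (boundary handling is an assumption)."""
    return "moderate" if temp_c > 50 else "low"


def classify_formamide(percent):
    """'moderate' above 25% formamide, otherwise 'low' (boundary handling is an assumption)."""
    return "moderate" if percent > 25 else "low"


temps = temperature_series()
temp_labels = [classify_temperature(t) for t in temps]
formamide_labels = {p: classify_formamide(p) for p in formamide_series()}
print(list(zip(temps, temp_labels)))
```

Note that decreasing from 68°C in 5°C steps ends at 43°C, the last step that does not fall below
the stated 42°C limit.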

Note that variations in the above conditions may be accomplished through the inclusion 
and/or substitution of alternate blocking reagents used to suppress background in hybridization 
experiments. Typical blocking reagents include Denhardt's reagent, BLOTTO, heparin, denatured 
salmon sperm DNA, and commercially available proprietary formulations. The inclusion of specific
blocking reagents may require modification of the hybridization conditions described above, due to
problems with compatibility.

Consequently, the present invention encompasses methods of isolating nucleic acids similar 
to the polynucleotides of the invention, comprising the steps of: 

a) contacting a collection of cDNA or genomic DNA molecules with a detectable probe
comprising at least 12, 15, 18, 20, 23, 25, 28, 30, 35, 40 or 50 consecutive nucleotides of a sequence
selected from the group consisting of the sequences of SEQ ID Nos: 1-241, the sequences of clone
inserts of the deposited clone pool and sequences complementary thereto under stringent, moderate
or low conditions which permit said probe to hybridize to at least a cDNA or genomic DNA molecule
in said collection;

b) identifying said cDNA or genomic DNA molecule which hybridizes to said detectable
probe; and

c) isolating said cDNA or genomic DNA molecule which hybridized to said probe.

PCR-based methods 

In addition to the above described methods, other protocols are available to obtain
homologous cDNAs using GENSET cDNA of the present invention or fragment thereof as outlined
in the following paragraphs.

cDNAs may be prepared by obtaining mRNA from the tissue, cell, or organism of interest
using mRNA preparation procedures utilizing polyA selection procedures or other techniques
known to those skilled in the art. A first primer capable of hybridizing to the polyA tail of the
mRNA is hybridized to the mRNA and a reverse transcription reaction is performed to generate a
first cDNA strand.

The term "capable of hybridizing to the polyA tail of said mRNA" refers to and embraces
all primers containing stretches of thymidine residues, so-called oligo(dT) primers, that hybridize to
the 3' end of eukaryotic poly(A)+ mRNAs to prime the synthesis of a first cDNA strand.
Techniques for generating said oligo(dT) primers and hybridizing them to mRNA to subsequently
prime the reverse transcription of said hybridized mRNA to generate a first cDNA strand are well
known to those skilled in the art and are described in Current Protocols in Molecular Biology, John
Wiley and Sons, Inc. 1997 and Sambrook et al., 1989. Preferably, said oligo(dT) primers are
present in a large excess in order to allow the hybridization of all mRNA 3' ends to at least one
oligo(dT) molecule. The priming and reverse transcription steps are preferably performed between 37°C
and 55°C depending on the type of reverse transcriptase used. Preferred oligo(dT) primers for
priming reverse transcription of mRNAs are oligonucleotides containing a stretch of thymidine
residues of sufficient length to hybridize specifically to the polyA tail of mRNAs, preferably of 12
to 18 thymidine residues in length. More preferably, such oligo(dT) primers comprise an additional
sequence upstream of the poly(dT) stretch in order to allow the addition of a given sequence to the
5' end of all first cDNA strands which may then be used to facilitate subsequent manipulation of the
cDNA. Preferably, this added sequence is 8 to 60 residues in length. For instance, the addition of a
restriction site at the 5' end of cDNAs facilitates subcloning of the obtained cDNA. Alternatively,
such an added 5' end may also be used to design PCR primers to specifically amplify cDNA clones of
interest.
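
By way of illustration, an oligo(dT) primer carrying an added 5' sequence may be assembled as in
the sketch below. The function name build_oligo_dt_primer and the 12-residue adapter, which
embeds an EcoRI site (GAATTC) between arbitrary flanking bases, are hypothetical. The sketch
enforces the preferred ranges stated above (12 to 18 thymidine residues, added sequence of 8 to
60 residues) as hard limits, which is a simplification.

```python
def build_oligo_dt_primer(t_count=18, adapter=""):
    """Return a 5'->3' primer consisting of an optional adapter followed by a poly(dT) stretch."""
    if not 12 <= t_count <= 18:
        raise ValueError("poly(dT) stretch should be 12 to 18 residues")
    if adapter and not 8 <= len(adapter) <= 60:
        raise ValueError("added 5' sequence should be 8 to 60 residues")
    return adapter.upper() + "T" * t_count


# Hypothetical adapter containing an EcoRI site to facilitate subcloning.
adapter = "GACTGAATTCGC"
primer = build_oligo_dt_primer(t_count=15, adapter=adapter)
print(primer)
```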

The first cDNA strand is then hybridized to a second primer. Any polynucleotide fragment
of the invention may be used, and in particular those described in the "Oligonucleotide primers and
probes" section. This second primer contains at least 10 consecutive nucleotides of a polynucleotide
of the invention. Preferably, the primer comprises at least 10, 12, 15, 17, 18, 20, 23, 25, or 28
consecutive nucleotides of a polynucleotide of the invention. In some embodiments, the primer
comprises more than 30 nucleotides of a polynucleotide of the invention. If it is desired to obtain
cDNAs containing the full protein coding sequence, including the authentic translation initiation
site, the second primer used contains sequences located upstream of the translation initiation site.
The second primer is extended to generate a second cDNA strand complementary to the first cDNA
strand. Alternatively, RT-PCR may be performed as described above using primers from both ends
of the cDNA to be obtained.

The double stranded cDNAs made using the methods described above are isolated and
cloned. The cDNAs may be cloned into vectors such as plasmids or viral vectors capable of
replicating in an appropriate host cell. For example, the host cell may be a bacterial, mammalian,
avian, or insect cell.

Techniques for isolating mRNA, reverse transcribing a primer hybridized to mRNA to
generate a first cDNA strand, extending a primer to make a second cDNA strand complementary to
the first cDNA strand, isolating the double stranded cDNA and cloning the double stranded cDNA
are well known to those skilled in the art and are described in Current Protocols in Molecular
Biology, John Wiley & Sons, Inc. 1997 and Sambrook et al., 1989.

Consequently, the present invention encompasses methods of making cDNAs. In a first
embodiment, the method of making a cDNA comprises the steps of

a) contacting a collection of mRNA molecules from human cells with a primer comprising
at least 12, 15, 18, 20, 23, 25, 28, 30, 35, 40, or 50 consecutive nucleotides of a sequence selected
from the group consisting of the sequences complementary to SEQ ID Nos: 1-241 and sequences
complementary to a clone insert of the deposited clone pool;

b) hybridizing said primer to an mRNA in said collection;

c) reverse transcribing said hybridized primer to make a first cDNA strand from said
mRNA;

d) making a second cDNA strand complementary to said first cDNA strand; and

e) isolating the resulting cDNA comprising said first cDNA strand and said second cDNA
strand.

Another embodiment of the present invention is a purified cDNA obtainable by the method 
of the preceding paragraph. In one aspect of this embodiment, the cDNA encodes at least a portion 
of a human polypeptide. 

In a second embodiment, the method of making a cDNA comprises the steps of

a) contacting a collection of mRNA molecules from human cells with a first primer capable
of hybridizing to the polyA tail of said mRNA;

b) hybridizing said first primer to said polyA tail;

c) reverse transcribing said mRNA to make a first cDNA strand;

d) making a second cDNA strand complementary to said first cDNA strand using at least
one primer comprising at least 12, 15, 18, 20, 23, 25, 28, 30, 35, 40, or 50 consecutive nucleotides of
a sequence selected from the group consisting of SEQ ID Nos: 1-241 and sequences of clone inserts
of the deposited clone pool; and

e) isolating the resulting cDNA comprising said first cDNA strand and said second cDNA
strand.

In another aspect of this method the second cDNA strand is made by

a) contacting said first cDNA strand with a second primer comprising at least 12, 15, 18, 20,
23, 25, 28, 30, 35, 40, or 50 consecutive nucleotides of a sequence selected from the group consisting
of SEQ ID Nos: 1-241 and sequences of clone inserts of the deposited clone pool, and a third primer
which sequence is fully included within the sequence of said first primer;

b) performing a first polymerase chain reaction with said second and third primers to
generate a first PCR product;

c) contacting said first PCR product with a fourth primer, comprising at least 12, 15, 18, 20,
23, 25, 28, 30, 35, 40, or 50 consecutive nucleotides of said sequence selected from the group
consisting of SEQ ID Nos: 1-241 and sequences of clone inserts of the deposited clone pool, and a
fifth primer, which sequence is fully included within the sequence of said third primer, wherein said
fourth and fifth primers hybridize to sequences within said first PCR product; and

d) performing a second polymerase chain reaction, thereby generating a second PCR
product.

Alternatively, the second cDNA strand may be made by contacting said first cDNA strand
with a second primer comprising at least 12, 15, 18, 20, 23, 25, 28, 30, 35, 40, or 50 consecutive
nucleotides of a sequence selected from the group consisting of SEQ ID Nos: 1-241 and sequences
of clone inserts of the deposited clone pool, and a third primer which sequence is fully included
within the sequence of said first primer and performing a polymerase chain reaction with said
second and third primers to generate said second cDNA strand.

Alternatively, the second cDNA strand may be made by

a) contacting said first cDNA strand with a second primer comprising at least 12, 15, 18, 20,
23, 25, 28, 30, 35, 40, or 50 consecutive nucleotides of a sequence selected from the group consisting
of SEQ ID Nos: 1-241 and sequences of clone inserts of the deposited clone pool;

b) hybridizing said second primer to said first strand cDNA; and

c) extending said hybridized second primer to generate said second cDNA strand.

Another embodiment of the present invention is a purified cDNA obtainable by a method of
making a cDNA of the invention. In one aspect of this embodiment, said cDNA encodes at least a
portion of a human polypeptide.

Other protocols 

Alternatively, other procedures may be used for obtaining homologous cDNAs. In one
approach, cDNAs are prepared from mRNA and cloned into double stranded phagemids as follows.
The cDNA library in the double stranded phagemids is then rendered single stranded by treatment
with an endonuclease, such as the Gene II product of the phage F1, and an exonuclease (Chang et
al., 1993, which disclosure is hereby incorporated by reference in its entirety). A biotinylated
oligonucleotide comprising the sequence of a fragment of a known GENSET cDNA, genomic DNA
or fragment thereof is hybridized to the single stranded phagemids. Preferably, the fragment
comprises at least 10, 12, 15, 17, 18, 20, 23, 25, or 28 consecutive nucleotides of a sequence
selected from the group consisting of the sequences of SEQ ID Nos: 1-241 and sequences of clone
inserts of the deposited clone pool.

Hybrids between the biotinylated oligonucleotide and phagemids are isolated by incubating
the hybrids with streptavidin coated paramagnetic beads and retrieving the beads with a magnet
(Fry et al., 1992, which disclosure is hereby incorporated by reference in its entirety). Thereafter,
the resulting phagemids are released from the beads and converted into double stranded DNA using
a primer specific for the GENSET cDNA or fragment used to design the biotinylated
oligonucleotide. Alternatively, protocols such as the Gene Trapper kit (Gibco BRL), which
disclosure is hereby incorporated by reference in its entirety, may be used. The
resulting double stranded DNA is transformed into bacteria. cDNAs homologous to the GENSET
cDNA or fragment thereof are identified by colony PCR or colony hybridization.

As a chromosome marker 

Chromosomal localizations of the cDNAs of the present invention were determined using
information from public and proprietary databases. Table VIII lists the putative chromosomal
location of the polynucleotides of the present invention. Column one lists the sequence identification
number with the corresponding chromosomal location listed in column two. Thus, the present
invention also relates to methods and compositions using the chromosomal location of the
polynucleotides of the invention to construct a human high resolution map or to identify a given
chromosome in a sample using any techniques known to those skilled in the art including those
disclosed below.

GENSET polynucleotides may also be mapped to their chromosomal locations using any
methods or techniques known to those skilled in the art including radiation hybrid (RH) mapping,
PCR-based mapping and fluorescence in situ hybridization (FISH) mapping described below.

Radiation hybrid mapping 

Radiation hybrid (RH) mapping is a somatic cell genetic approach that can be used for high
resolution mapping of the human genome. In this approach, cell lines containing one or more
human chromosomes are lethally irradiated, breaking each chromosome into fragments whose size
depends on the radiation dose. These fragments are rescued by fusion with cultured rodent cells,
yielding subclones containing different fragments of the human genome. This technique is
described by Benham et al. (1989) and Cox et al. (1990), which disclosures are hereby
incorporated by reference in their entireties. The random and independent nature of the subclones
permits efficient mapping of any human genome marker. Human DNA isolated from a panel of
80-100 cell lines provides a mapping reagent for ordering GENSET cDNAs or genomic DNAs. In this
approach, the frequency of breakage between markers is used to measure distance, allowing
construction of fine resolution maps as has been done using conventional ESTs (Schuler et al.,
1996), which disclosure is hereby incorporated by reference in its entirety.
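
By way of illustration only, the use of breakage frequency as a measure of distance may be sketched as follows. The sketch assumes that each marker has been scored as retained (1) or absent (0) in every hybrid of the panel, and it assumes one commonly used two-point estimator from the RH mapping literature, in which the breakage frequency theta is the number of discordant hybrids divided by T(RA + RB - 2 RA RB), where T is the number of hybrids and RA, RB are the retention frequencies, and the distance in centiRays is -100 ln(1 - theta). Neither the estimator nor the function name is specified in the present text.

```python
import math


def rh_distance(scores_a, scores_b):
    """Two-point estimate of breakage frequency (theta) and distance in
    centiRays between markers A and B from 0/1 retention scores.

    The estimator is an assumption of this sketch; see the lead-in text.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) == 0:
        raise ValueError("score vectors must be non-empty and of equal length")
    total = len(scores_a)
    # Hybrids that retain exactly one of the two markers indicate a break.
    discordant = sum(1 for a, b in zip(scores_a, scores_b) if a != b)
    ra = sum(scores_a) / total  # retention frequency of marker A
    rb = sum(scores_b) / total  # retention frequency of marker B
    denominator = total * (ra + rb - 2 * ra * rb)
    if denominator == 0:
        raise ValueError("markers are uninformative in this panel")
    theta = min(discordant / denominator, 1.0)
    if theta >= 1.0:
        return theta, math.inf  # markers appear unlinked
    return theta, -100.0 * math.log(1.0 - theta)


if __name__ == "__main__":
    # Hypothetical scores over a ten-hybrid panel.
    marker_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
    marker_b = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
    theta, centirays = rh_distance(marker_a, marker_b)
    print(round(theta, 3), round(centirays, 1))
```

In practice a panel of 80-100 hybrids, as described above, would be scored rather than the ten hypothetical hybrids shown here, and multipoint methods would ordinarily be preferred for ordering many markers.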
RH mapping has been used to generate a high-resolution whole genome radiation hybrid
map of human chromosome 17q22-q25.3 across the genes for growth hormone (GH) and thymidine
kinase (TK) (Foster et al., 1996), the region surrounding the Gorlin syndrome gene (Obermayr et
al., 1996), 60 loci covering the entire short arm of chromosome 12 (Raeymaekers et al., 1995), the
region of human chromosome 22 containing the neurofibromatosis type 2 locus (Frazer et al., 1992)
and 13 loci on the long arm of chromosome 5 (Warrington et al., 1991), which disclosures are
hereby incorporated by reference in their entireties.

Mapping of cDNAs to Human Chromosomes using PCR techniques 

GENSET cDNAs and genomic DNAs may be assigned to human chromosomes using PCR-based
methodologies. In such approaches, oligonucleotide primer pairs are designed from the
cDNA sequence to minimize the chance of amplifying through an intron. Preferably, the
oligonucleotide primers are 18-23 bp in length and are designed for PCR amplification. The
creation of PCR primers from known sequences is well known to those with skill in the art. For a
review of PCR technology see Erlich (1992), which disclosure is hereby incorporated by reference
in its entirety.

The primers are used in polymerase chain reactions (PCR) to amplify templates from total
human genomic DNA. PCR conditions are as follows: 60 ng of genomic DNA is used as a template
for PCR with 80 ng of each oligonucleotide primer, 0.6 unit of Taq polymerase, and 1 uCi of a
32P-labeled deoxycytidine triphosphate. The PCR is performed in a microplate thermocycler (Techne)
under the following conditions: 30 cycles of 94 degree Celsius, 1.4 min; 55 degree Celsius, 2 min;
and 72 degree Celsius, 2 min; with a final extension at 72 degree Celsius for 10 min. The amplified
products are analyzed on a 6% polyacrylamide sequencing gel and visualized by autoradiography.
If the length of the resulting PCR product is identical to the distance between the ends of the primer
sequences in the cDNA from which the primers are derived, then the PCR reaction is repeated with
DNA templates from two panels of human-rodent somatic cell hybrids, BIOS PCRable DNA (BIOS
Corporation) and NIGMS Human-Rodent Somatic Cell Hybrid Mapping Panel Number 1 (NIGMS,
Camden, NJ).
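
The length comparison described above may be sketched as follows. This is an illustrative example only: it assumes that the forward primer matches the cDNA strand exactly and that the reverse primer is the exact reverse complement of a downstream site; the function names, the tolerance parameter and the example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def expected_amplicon_length(cdna: str, forward: str, reverse: str) -> int:
    """Distance in bp between the outer ends of the two primer sites in the cDNA."""
    cdna = cdna.upper()
    fwd_start = cdna.find(forward.upper())
    rev_site = reverse_complement(reverse)
    rev_start = cdna.find(rev_site)
    if fwd_start < 0 or rev_start < 0:
        raise ValueError("primer site not found in cDNA")
    rev_end = rev_start + len(rev_site)
    if rev_end <= fwd_start:
        raise ValueError("reverse primer site lies upstream of forward primer")
    return rev_end - fwd_start


def consistent_with_cdna(observed_bp: int, expected_bp: int, tolerance_bp: int = 0) -> bool:
    """True if the observed product length matches the cDNA-derived length.

    A genomic product longer than expected would suggest an intervening intron.
    """
    return abs(observed_bp - expected_bp) <= tolerance_bp


if __name__ == "__main__":
    # Hypothetical 35 nt cDNA and 6 nt primers, shortened for readability.
    cdna = "AAACCCGGGTTTACGTACGTAGCTAGCTAGGATCC"
    forward = "CCCGGG"
    reverse = reverse_complement("GCTAGG")
    expected = expected_amplicon_length(cdna, forward, reverse)
    print(expected, consistent_with_cdna(observed_bp=expected, expected_bp=expected))
```

Only primer pairs whose genomic product passes this check would be carried forward to the somatic cell hybrid panels.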

PCR is used to screen a series of somatic cell hybrid cell lines containing defined sets of
human chromosomes for the presence of a given cDNA or genomic DNA. DNA is isolated from
the somatic hybrids and used as starting templates for PCR reactions using the primer pairs from the
GENSET cDNAs or genomic DNAs. Only those somatic cell hybrids with chromosomes
containing the human gene corresponding to the GENSET cDNA or genomic DNA will yield an
amplified fragment. The GENSET cDNAs or genomic DNAs are assigned to a chromosome by
analysis of the segregation pattern of PCR products from the somatic hybrid DNA templates. The
single human chromosome present in all cell hybrids that give rise to an amplified fragment is the
chromosome containing that GENSET cDNA or genomic DNA. For a review of techniques and
analysis of results from somatic cell gene mapping experiments, see Ledbetter et al. (1990), which
disclosure is hereby incorporated by reference in its entirety.
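
The segregation analysis described above may be sketched as a set intersection. This is a minimal illustration, assuming the chromosome content of each hybrid is known and each hybrid has been scored as PCR-positive or PCR-negative; the additional step of excluding chromosomes found in negative hybrids is an assumption of this sketch, and the hybrid names and panel composition are hypothetical.

```python
def assign_chromosome(hybrid_panel, pcr_positive):
    """Return the set of candidate human chromosomes for a marker.

    hybrid_panel: dict mapping hybrid name -> set of human chromosomes it carries.
    pcr_positive: dict mapping hybrid name -> True if an amplified fragment was seen.
    """
    positives = [name for name, hit in pcr_positive.items() if hit]
    negatives = [name for name, hit in pcr_positive.items() if not hit]
    if not positives:
        return set()
    # Chromosomes present in every hybrid that gave an amplified fragment.
    candidates = set.intersection(*(set(hybrid_panel[name]) for name in positives))
    # Assumption of this sketch: a chromosome carried by a negative hybrid
    # cannot contain the marker, so it is removed from the candidates.
    for name in negatives:
        candidates -= set(hybrid_panel[name])
    return candidates


if __name__ == "__main__":
    # Hypothetical four-hybrid panel.
    panel = {
        "H1": {"1", "7", "X"},
        "H2": {"7", "12"},
        "H3": {"1", "12"},
        "H4": {"7", "X", "21"},
    }
    results = {"H1": True, "H2": True, "H3": False, "H4": True}
    print(assign_chromosome(panel, results))
```

A result containing exactly one chromosome corresponds to an unambiguous assignment; an empty or multi-member result suggests that the panel is insufficient or that a PCR score should be repeated.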

Mapping of cDNAs to Chromosomes Using Fluorescence in situ Hybridization 

Fluorescence in situ hybridization allows the GENSET cDNA or genomic DNA to be
mapped to a particular location on a given chromosome. The chromosomes to be used for
fluorescence in situ hybridization techniques may be obtained from a variety of sources including
cell cultures, tissues, or whole blood.

In a preferred embodiment, chromosomal localization of a GENSET cDNA or genomic
DNA is obtained by FISH as described by Cherif et al. (1990), which disclosure is hereby
incorporated by reference in its entirety. Metaphase chromosomes are prepared from
phytohemagglutinin (PHA)-stimulated blood cell donors. PHA-stimulated lymphocytes from
healthy males are cultured for 72 h in RPMI-1640 medium. For synchronization, methotrexate (10
uM) is added for 17 h, followed by addition of 5-bromodeoxyuridine (5-BudR, 0.1 mM) for 6 h.
Colcemid (1 ug/ml) is added for the last 15 min before harvesting the cells. Cells are collected,
washed in RPMI, incubated with a hypotonic solution of KCl (75 mM) at 37 degree Celsius for 15
min and fixed in three changes of methanol:acetic acid (3:1). The cell suspension is dropped onto a
glass slide and air dried. The GENSET cDNA or genomic DNA is labeled with biotin-16 dUTP by
nick translation according to the manufacturer's instructions (Bethesda Research Laboratories,
Bethesda, MD), purified using a Sephadex G-50 column (Pharmacia, Uppsala, Sweden) and
precipitated. Just prior to hybridization, the DNA pellet is dissolved in hybridization buffer (50%
formamide, 2 X SSC, 10% dextran sulfate, 1 mg/ml sonicated salmon sperm DNA, pH 7) and the
probe is denatured at 70 degree Celsius for 5-10 min.

Slides kept at -20 degree Celsius are treated for 1 h at 37 degree Celsius with RNase A (100
ug/ml), rinsed three times in 2 X SSC and dehydrated in an ethanol series. Chromosome
preparations are denatured in 70% formamide, 2 X SSC for 2 min at 70 degree Celsius, then
dehydrated at 4 degree Celsius. The slides are treated with proteinase K (10 ug/100 ml in 20 mM
Tris-HCl, 2 mM CaCl2) at 37 degree Celsius for 8 min and dehydrated. The hybridization mixture
containing the probe is placed on the slide, covered with a coverslip, sealed with rubber cement and
incubated overnight in a humid chamber at 37 degree Celsius. After hybridization and
post-hybridization washes, the biotinylated probe is detected by avidin-FITC and amplified with
additional layers of biotinylated goat anti-avidin and avidin-FITC. For chromosomal localization,
fluorescent R-bands are obtained as previously described (Cherif et al., 1990). The slides are
observed under a LEICA fluorescence microscope (DMRXA). Chromosomes are counterstained
with propidium iodide and the fluorescent signal of the probe appears as two symmetrical
yellow-green spots on both chromatids of the fluorescent R-band chromosome (red). Thus, a particular
GENSET cDNA or genomic DNA may be localized to a particular cytogenetic R-band on a given
chromosome.

Use of cDNAs to Construct or Expand Chromosome Maps 

Once the GENSET cDNAs or genomic DNAs have been assigned to particular
chromosomes using any technique known to those skilled in the art,
particularly those described herein, they may be utilized to construct a high resolution map of the
chromosomes on which they are located or to identify the chromosomes in a sample.

Chromosome mapping involves assigning a given unique sequence to a particular
chromosome as described above. Once the unique sequence has been mapped to a given
chromosome, it is ordered relative to other unique sequences located on the same chromosome.
One approach to chromosome mapping utilizes a series of yeast artificial chromosomes (YACs)
bearing several thousand long inserts derived from the chromosomes of the organism from which
the GENSET cDNAs or genomic DNAs are obtained. This approach is described in Nagaraja et al.
(1997), which disclosure is hereby incorporated by reference in its entirety. Briefly, in this
approach each chromosome is broken into overlapping pieces which are inserted into the YAC
vector. The YAC inserts are screened using PCR or other methods to determine whether they
include the GENSET cDNA or genomic DNA whose position is to be determined. Once an insert
has been found which includes the GENSET cDNA or genomic DNA, the insert can be analyzed by
PCR or other methods to determine whether the insert also contains other sequences known to be on
the chromosome or in the region from which the GENSET cDNA or genomic DNA was derived.
This process can be repeated for each insert in the YAC library to determine the location of each of
the GENSET cDNAs or genomic DNAs relative to one another and to other known chromosomal
markers. In this way, a high resolution map of the distribution of numerous unique markers along
each of the organism's chromosomes may be obtained.
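
The screening logic described above may be sketched as follows, assuming that each positive PCR result is recorded as a pair consisting of a YAC insert identifier and a marker name. All identifiers and function names are hypothetical; the sketch reports only which markers share an insert with the query and does not attempt to order them.

```python
from collections import defaultdict


def markers_by_insert(hits):
    """Group marker names by the YAC insert in which they were detected.

    hits: iterable of (yac_id, marker) pairs, one per positive screen.
    """
    grouped = defaultdict(set)
    for yac_id, marker in hits:
        grouped[yac_id].add(marker)
    return grouped


def linked_markers(hits, query):
    """Return every other marker found on at least one insert that also
    contains the query marker, i.e. markers physically close to it."""
    linked = set()
    for markers in markers_by_insert(hits).values():
        if query in markers:
            linked |= markers - {query}
    return linked


if __name__ == "__main__":
    # Hypothetical screening results.
    hits = [
        ("Y1", "GENSET_A"), ("Y1", "STS_1"),
        ("Y2", "STS_1"), ("Y2", "STS_2"),
        ("Y3", "GENSET_A"), ("Y3", "STS_3"),
    ]
    print(sorted(linked_markers(hits, "GENSET_A")))
```

Markers sharing an insert with the query lie within one insert length of it; relative order may then be inferred from the pattern of overlaps between adjacent inserts.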

Identification of genes associated with hereditary diseases or drug response 

This example illustrates an approach useful for the association of GENSET cDNAs or
genomic DNAs with particular phenotypic characteristics. In this example, a particular GENSET
cDNA or genomic DNA is used as a test probe to associate that GENSET cDNA or genomic DNA
with a particular phenotypic characteristic.

GENSET cDNAs or genomic DNAs are mapped to a particular location on a human
chromosome using techniques such as those described herein or other techniques known in the art.
A search of Mendelian Inheritance in Man (V. McKusick, Mendelian Inheritance in Man, available
on line through Johns Hopkins University Welch Medical Library) reveals the region of the human
chromosome which contains the GENSET cDNA or genomic DNA to be a very gene rich region
containing several known genes and several diseases or phenotypes for which genes have not been
identified. The gene corresponding to this GENSET cDNA or genomic DNA thus becomes an
immediate candidate for each of these genetic diseases.

Cells from patients with these diseases or phenotypes are isolated and expanded in culture.
PCR primers from the GENSET cDNA or genomic DNA are used to screen genomic DNA, mRNA
or cDNA obtained from the patients. GENSET cDNAs or genomic DNAs that are not amplified in
the patients can be positively associated with a particular disease by further analysis. Alternatively,
the PCR analysis may yield fragments of different lengths when the samples are derived from an
individual having the phenotype associated with the disease than when the sample is derived from a
healthy individual, indicating that the gene containing the cDNA may be responsible for the genetic
disease.

Uses of polynucleotides in recombinant vectors 

The present invention also relates to recombinant vectors, which include the isolated
polynucleotides of the present invention, or fragments thereof, and to host cells recombinant for a
polynucleotide of the invention, such as the above vectors, as well as to methods of making such
vectors and host cells and for using them for production of GENSET polypeptides by recombinant
techniques.

Recombinant Vectors 

The term "vector" is used herein to designate either a circular or a linear DNA or RNA
molecule, which is either double-stranded or single-stranded, and which comprises at least one
polynucleotide of interest that is sought to be transferred into a cell host or into a unicellular or
multicellular host organism. The present invention encompasses a family of recombinant vectors
that comprise a regulatory polynucleotide and/or a coding polynucleotide derived from either the
GENSET genomic sequence or the cDNA sequence. Generally, a recombinant vector of the
invention may comprise any of the polynucleotides described herein, including regulatory
sequences, coding sequences and polynucleotide constructs, as well as any GENSET primer or
probe as defined herein.

In a first preferred embodiment, a recombinant vector of the invention is used to amplify the
inserted polynucleotide derived from a GENSET genomic sequence or a GENSET cDNA, for
example any cDNA selected from the group consisting of sequences of SEQ ID Nos: 1-241,
sequences of clone inserts of the deposited clone pool, variants and fragments thereof, in a suitable
cell host, this polynucleotide being amplified each time that the recombinant vector replicates.

A second preferred embodiment of the recombinant vectors according to the invention
comprises expression vectors comprising either a regulatory polynucleotide or a coding nucleic acid
of the invention, or both. Within certain embodiments, expression vectors are employed to express
a GENSET polypeptide which can then be purified and, for example, be used in ligand screening
assays or as an immunogen in order to raise specific antibodies directed against the GENSET
protein. In other embodiments, the expression vectors are used for constructing transgenic animals
and also for gene therapy. Expression requires that appropriate signals are provided in the vectors,
said signals including various regulatory elements, such as enhancers/promoters from both viral and
mammalian sources that drive expression of the genes of interest in host cells. Dominant drug
selection markers for establishing permanent, stable cell clones expressing the products are
generally included in the expression vectors of the invention, as are elements that link
expression of the drug selection markers to expression of the polypeptide.

More particularly, the present invention relates to expression vectors which include nucleic
acids encoding a GENSET protein, preferably a GENSET protein with an amino acid sequence
selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides
included in sequences of SEQ ID Nos: 242-272 and 274-384, and sequences of full-length or
mature polypeptides encoded by the clone inserts of the deposited clone pool, as well as variants
and fragments thereof. The polynucleotides of the present invention may be used to express an
encoded protein in a host organism to produce a beneficial effect. In such procedures, the encoded
protein may be transiently expressed in the host organism or stably expressed in the host organism.
The encoded protein may have any of the activities described herein. The encoded protein may be a
protein which the host organism lacks or, alternatively, the encoded protein may augment the
existing levels of the protein in the host organism.

Some of the elements which can be found in the vectors of the present invention are 
described in further detail in the following sections. 

General features of the expression vectors of the invention 

A recombinant vector according to the invention comprises, but is not limited to, a YAC
(Yeast Artificial Chromosome), a BAC (Bacterial Artificial Chromosome), a phage, a phagemid, a
cosmid, a plasmid or even a linear DNA molecule which may comprise a chromosomal,
non-chromosomal, semi-synthetic and synthetic DNA. Such a recombinant vector can comprise a
transcriptional unit comprising an assembly of:

(1) a genetic element or elements having a regulatory role in gene expression, for example
promoters or enhancers. Enhancers are cis-acting elements of DNA, usually from about 10 to 300
bp in length, that act on the promoter to increase the transcription;

(2) a structural or coding sequence which is transcribed into mRNA and eventually
translated into a polypeptide, said structural or coding sequence being operably linked to the
regulatory elements described in (1); and

(3) appropriate transcription initiation and termination sequences. Structural units intended
for use in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, when a recombinant
protein is expressed without a leader or transport sequence, it may include an N-terminal residue.
This residue may or may not be subsequently cleaved from the expressed recombinant protein to
provide a final product.

Generally, recombinant expression vectors will include origins of replication, selectable
markers permitting transformation of the host cell, and a promoter derived from a highly expressed
gene to direct transcription of a downstream structural sequence. The heterologous structural
sequence is assembled in appropriate phase with translation initiation and termination sequences,
and preferably a leader sequence capable of directing secretion of the translated protein into the
periplasmic space or the extracellular medium. In a specific embodiment wherein the vector is
adapted for transfecting and expressing desired sequences in mammalian host cells, preferred
vectors will comprise an origin of replication in the desired host, a suitable promoter and enhancer,
and also any necessary ribosome binding sites, polyadenylation signal, splice donor and acceptor
sites, transcriptional termination sequences, and 5'-flanking non-transcribed sequences. DNA
sequences derived from the SV40 viral genome, for example SV40 origin, early promoter,
enhancer, splice and polyadenylation signals may be used to provide the required non-transcribed
genetic elements.

The in vivo expression of a GENSET polypeptide of the present invention may be useful in
order to correct a genetic defect related to the expression of the native gene in a host organism or to
the production of a biologically inactive GENSET protein. Consequently, the present invention also
comprises recombinant expression vectors mainly designed for the in vivo production of a GENSET
polypeptide of the present invention by the introduction of the appropriate genetic material in the
organism or the patient to be treated. This genetic material may be introduced in vitro in a cell that
has been previously extracted from the organism, the modified cell being subsequently reintroduced
in the said organism, or directly in vivo into the appropriate tissue.

Regulatory Elements 

The suitable promoter regions used in the expression vectors according to the present
invention are chosen taking into account the cell host in which the heterologous gene has to be
expressed. The particular promoter employed to control the expression of a nucleic acid sequence
of interest is not believed to be important, so long as it is capable of directing the expression of the
nucleic acid in the targeted cell. Thus, where a human cell is targeted, it is preferable to position the
nucleic acid coding region adjacent to and under the control of a promoter that is capable of being
expressed in a human cell, such as, for example, a human or a viral promoter.

A suitable promoter may be heterologous with respect to the nucleic acid for which it
controls the expression or alternatively can be endogenous to the native polynucleotide containing
the coding sequence to be expressed. Additionally, the promoter is generally heterologous with
respect to the recombinant vector sequences within which the construct promoter/coding sequence
has been inserted.

Promoter regions can be selected from any desired gene using, for example, CAT
(chloramphenicol transferase) vectors and more preferably pKK232-8 and pCM7 vectors.

Preferred bacterial promoters are the LacI, LacZ, the T3 or T7 bacteriophage RNA
polymerase promoters, the gpt, lambda PR, PL and trp promoters (EP 0036776), the polyhedrin
promoter, or the p10 protein promoter from baculovirus (Kit Novagen), (Smith et al., 1983;
O'Reilly et al., 1992), which disclosures are hereby incorporated by reference in their entireties, the
lambda PR promoter or also the trc promoter.

Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late
SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of a convenient vector and
promoter is well within the level of ordinary skill in the art. The choice of a promoter is well within
the ability of a person skilled in the field of genetic engineering. For example, one may refer to the
book of Sambrook et al. (1989) or also to the procedures described by Fuller et al. (1996), which
disclosures are hereby incorporated by reference in their entireties.

Other regulatory elements 

Where a cDNA insert is employed, one will typically desire to include a polyadenylation
signal to effect proper polyadenylation of the gene transcript. The nature of the polyadenylation
signal is not believed to be crucial to the successful practice of the invention, and any such sequence
may be employed such as human growth hormone and SV40 polyadenylation signals. Also
contemplated as an element of the expression cassette is a terminator. These elements can serve to
enhance message levels and to minimize read through from the cassette into other sequences.

Selectable Markers 

Selectable markers confer an identifiable change to the cell permitting easy identification of
cells containing the expression construct. The selectable marker genes for selection of transformed
host cells are preferably dihydrofolate reductase or neomycin resistance for eukaryotic cell culture,
TRP1 for S. cerevisiae or tetracycline, rifampicin or ampicillin resistance in E. coli, or levan
saccharase for mycobacteria, this latter marker being a negative selection marker.

Preferred Vectors 
Bacterial vectors

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and a bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of pBR322 (ATCC 37017). Such commercial
vectors include, for example, pKK223-3 (Pharmacia, Uppsala, Sweden), and pGEM1 (Promega
Biotec, Madison, WI, USA).

Large numbers of other suitable vectors are known to those of skill in the art, and
commercially available, such as the following bacterial vectors: pQE70, pQE60, pQE-9 (Qiagen),
pbs, pD10, phagescript, psiX174, pbluescript SK, pbsks, pNH8A, pNH16A, pNH18A, pNH46A
(Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); pWLNEO, pSV2CAT,
pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia); pQE-30
(QIAexpress).

Bacteriophage vectors 

The P1 bacteriophage vector may contain large inserts ranging from about 80 to about 100
kb. The construction of P1 bacteriophage vectors such as p158 or p158/neo8 is notably described
by Sternberg (1992, 1994), which disclosures are hereby incorporated by reference in their entireties.
Recombinant P1 clones comprising GENSET nucleotide sequences may be designed for inserting
large polynucleotides of more than 40 kb (See Linton et al., 1993), which disclosure is hereby
incorporated by reference in its entirety. To generate P1 DNA for transgenic experiments, a
preferred protocol is the protocol described by McCormick et al. (1994), which disclosure is hereby
incorporated by reference in its entirety. Briefly, E. coli (preferably strain NS3529) harboring the
P1 plasmid are grown overnight in a suitable broth medium containing 25 ug/ml of kanamycin.
The P1 DNA is prepared from the E. coli by alkaline lysis using the Qiagen Plasmid Maxi kit
(Qiagen, Chatsworth, CA, USA), according to the manufacturer's instructions. The P1 DNA is
purified from the bacterial lysate on two Qiagen-tip 500 columns, using the washing and elution
buffers contained in the kit. A phenol/chloroform extraction is then performed before precipitating
the DNA with 70% ethanol. After solubilizing the DNA in TE (10 mM Tris-HCl, pH 7.4, 1 mM
EDTA), the concentration of the DNA is assessed by spectrophotometry.
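
The spectrophotometric assessment may be illustrated by the following sketch. It assumes the widely used convention that an absorbance of 1.0 at 260 nm over a 1 cm path corresponds to approximately 50 ug/ml of double stranded DNA; this conversion factor, the function names and the example readings are assumptions of the sketch and are not stated in the protocol above.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # conventional factor for double stranded DNA (assumed)


def dsdna_concentration_ug_per_ml(a260: float,
                                  dilution_factor: float = 1.0,
                                  path_length_cm: float = 1.0) -> float:
    """Estimate the dsDNA concentration of the undiluted sample from its A260."""
    if a260 < 0 or dilution_factor <= 0 or path_length_cm <= 0:
        raise ValueError("readings and factors must be positive")
    return a260 / path_length_cm * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def a260_a280_ratio(a260: float, a280: float) -> float:
    """Ratio often used as a rough indicator of protein contamination."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


if __name__ == "__main__":
    # Hypothetical reading of a 1:20 dilution of the P1 DNA in TE.
    print(dsdna_concentration_ug_per_ml(0.25, dilution_factor=20))
    print(round(a260_a280_ratio(0.25, 0.135), 2))
```

The blank for such a measurement would ordinarily be the same TE buffer in which the DNA is dissolved.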

When the goal is to express a P1 clone comprising GENSET nucleotide sequences in a
transgenic animal, typically in transgenic mice, it is desirable to remove vector sequences from the
P1 DNA fragment, for example by cleaving the P1 DNA at rare-cutting sites within the P1
polylinker (SfiI, NotI or SalI). The P1 insert is then purified from vector sequences on a
pulsed-field agarose gel, using methods similar to those originally reported for the isolation of DNA from
YACs (See e.g., Schedl et al., 1993a; Peterson et al., 1993), which disclosures are hereby
incorporated by reference in their entireties. At this stage, the resulting purified insert DNA can be
concentrated, if necessary, on a Millipore Ultrafree-MC Filter Unit (Millipore, Bedford, MA, USA
- 30,000 molecular weight limit) and then dialyzed against microinjection buffer (10 mM Tris-HCl,
pH 7.4; 250 uM EDTA) containing 100 mM NaCl, 30 uM spermine, 70 uM spermidine on a
microdialysis membrane (type VS, 0.025 um from Millipore). The intactness of the purified P1
DNA insert is assessed by electrophoresis on 1% agarose (SeaKem GTG; FMC Bio-products)
pulsed-field gel and staining with ethidium bromide.



Viral vectors 

In one specific embodiment, the vector is derived from an adenovirus. Preferred adenovirus
vectors according to the invention are those described by Feldman and Steg (1996), or Ohno et al.
(1994), which disclosures are hereby incorporated by reference in their entireties. Another
preferred recombinant adenovirus according to this specific embodiment of the present invention is
the human adenovirus type 2 or 5 (Ad 2 or Ad 5) or an adenovirus of animal origin (French patent
application No. FR-93.05954), which disclosure is hereby incorporated by reference in its entirety.

Retrovirus vectors and adeno-associated virus vectors are generally understood to be the
recombinant gene delivery systems of choice for the transfer of exogenous polynucleotides in vivo,
particularly to mammals, including humans. These vectors provide efficient delivery of genes into
cells, and the transferred nucleic acids are stably integrated into the chromosomal DNA of the host.
Particularly preferred retroviruses for the preparation or construction of retroviral in vitro or in vivo
gene delivery vehicles of the present invention include retroviruses selected from the group
consisting of Mink-Cell Focus Inducing Virus, Murine Sarcoma Virus, Reticuloendotheliosis virus
and Rous Sarcoma virus. Particularly preferred Murine Leukemia Viruses include the 4070A and
the 1504A viruses, Abelson (ATCC No VR-999), Friend (ATCC No VR-245), Gross (ATCC No
VR-590), Rauscher (ATCC No VR-998) and Moloney Murine Leukemia Virus (ATCC No
VR-190; PCT Application No WO 94/24298). Particularly preferred Rous Sarcoma Viruses include
Bryan high titer (ATCC Nos VR-334, VR-657, VR-726, VR-659 and VR-728). Other preferred
retroviral vectors are those described in Roth et al. (1996), PCT Application No WO 93/25234,
PCT Application No WO 94/06920, Roux et al. (1989), Julan et al. (1992), and Neda et al.
(1991), which disclosures are hereby incorporated by reference in their entireties.

Yet another viral vector system that is contemplated by the invention comprises the
adeno-associated virus (AAV). The adeno-associated virus is a naturally occurring defective virus that
requires another virus, such as an adenovirus or a herpes virus, as a helper virus for efficient
replication and a productive life cycle (Muzyczka et al., 1992), which disclosure is hereby
incorporated by reference in its entirety. It is also one of the few viruses that may integrate its DNA
into non-dividing cells, and exhibits a high frequency of stable integration (Flotte et al., 1992;
Samulski et al., 1989; McLaughlin et al., 1989), which disclosures are hereby incorporated by
reference in their entireties. One advantageous feature of AAV derives from its reduced efficacy
for transducing primary cells relative to transformed cells.

BAC vectors 

The bacterial artificial chromosome (BAC) cloning system (Shizuya et al., 1992), which
disclosure is hereby incorporated by reference in its entirety, has been developed to stably maintain
large fragments of genomic DNA (100-300 kb) in E. coli. A preferred BAC vector comprises a
pBeloBAC11 vector that has been described by Kim et al. (1996), which disclosure is hereby
incorporated by reference in its entirety. BAC libraries are prepared with this vector using
size-selected genomic DNA that has been partially digested using enzymes that permit ligation into
either the BamHI or HindIII sites in the vector. Flanking these cloning sites are T7 and SP6 RNA
polymerase transcription initiation sites that can be used to generate end probes by either RNA
transcription or PCR methods. After the construction of a BAC library in E. coli, BAC DNA is
purified from the host cell as a supercoiled circle. Converting these circular molecules into a linear
form precedes both size determination and introduction of the BACs into recipient cells. The
cloning site is flanked by two NotI sites, permitting cloned segments to be excised from the vector
by NotI digestion. Alternatively, the DNA insert contained in the pBeloBAC11 vector may be
linearized by treatment of the BAC vector with the commercially available enzyme lambda
terminase that leads to the cleavage at the unique cosN site, but this cleavage method results in a
full length BAC clone containing both the insert DNA and the BAC sequences.

Baculovirus:

Another specific suitable host vector system is the pVL1392/1393 baculovirus transfer
vector (Pharmingen) that is used to transfect the SF9 cell line (ATCC No. CRL 1711), which is
derived from Spodoptera frugiperda. Other suitable vectors for the expression of the GENSET
polypeptide of the present invention in a baculovirus expression system include those described by
Chai et al. (1993), Vlasak et al. (1983), and Lenhard et al. (1996), which disclosures are hereby
incorporated by reference in their entireties.

Delivery Of The Recombinant Vectors: 

To effect expression of the polynucleotides and polynucleotide constructs of the invention,
these constructs must be delivered into a cell. This delivery may be accomplished in vitro, as in
laboratory procedures for transforming cell lines, or in vivo or ex vivo, as in the treatment of certain
disease states. One mechanism is viral infection, where the expression construct is encapsulated in
an infectious viral particle.

Several non-viral methods for the transfer of polynucleotides into cultured mammalian cells
are also contemplated by the present invention, and include, without being limited to, calcium
phosphate precipitation (Graham et al., 1973; Chen et al., 1987); DEAE-dextran (Gopal, 1985);
electroporation (Tur-Kaspa et al., 1986; Potter et al., 1984); direct microinjection (Harland et al.,
1985); DNA-loaded liposomes (Nicolau et al., 1982; Fraley et al., 1979); and receptor-mediated
transfection (Wu and Wu, 1987, 1988), which disclosures are hereby incorporated by reference in
their entireties. Some of these techniques may be successfully adapted for in vivo or ex vivo use.

Once the expression polynucleotide has been delivered into the cell, it may be stably
integrated into the genome of the recipient cell. This integration may be in the cognate location and
orientation via homologous recombination (gene replacement) or it may be integrated in a random,
non-specific location (gene augmentation). In yet further embodiments, the nucleic acid may be
stably maintained in the cell as a separate, episomal segment of DNA. Such nucleic acid segments
or "episomes" encode sequences sufficient to permit maintenance and replication independent of or
in synchronization with the host cell cycle.

One specific embodiment for a method for delivering a protein or peptide to the interior of a 
cell of a vertebrate in vivo comprises the step of introducing a preparation comprising a 
physiologically acceptable carrier and a naked polynucleotide operatively coding for the 
polypeptide of interest into the interstitial space of a tissue comprising the cell, whereby the naked 
polynucleotide is taken up into the interior of the cell and has a physiological effect. This is
particularly applicable for transfer in vitro, but it may be applied in vivo as well.

Compositions for use in vitro and in vivo comprising a "naked" polynucleotide are
described in PCT application No. WO 90/11092 (Vical Inc.) and also in PCT application No. WO
95/11307 (Institut Pasteur, INSERM, Universite d'Ottawa) as well as in the articles of Tacson et al.
(1996) and of Huygen et al. (1996), which disclosures are hereby incorporated by reference in their
entireties.

In still another embodiment of the invention, the transfer of a naked polynucleotide of the
invention, including a polynucleotide construct of the invention, into cells may be carried out by
particle bombardment (biolistics), said particles being DNA-coated microprojectiles accelerated to a
high velocity allowing them to pierce cell membranes and enter cells without killing them, such as
described by Klein et al. (1987), which disclosure is hereby incorporated by reference in its
entirety.

In a further embodiment, the polynucleotide of the invention may be entrapped in a
liposome (Ghosh and Bacchawat, 1991; Wong et al., 1980; Nicolau et al., 1987), which disclosures
are hereby incorporated by reference in their entireties.

In a specific embodiment, the invention provides a composition for the in vivo production of 
the GENSET protein or polypeptide described herein. It comprises a naked polynucleotide 
operatively coding for this polypeptide, in solution in a physiologically acceptable carrier, and 
suitable for introduction into a tissue to cause cells of the tissue to express the said protein or
polypeptide.

The amount of vector to be injected into the desired host organism varies according to the site
of injection. As an indicative dose, between 0.1 and 100 µg of the vector will be injected into an
animal body, preferably a mammal body, for example a mouse body.

In another embodiment, the vector according to the invention may be introduced in
vitro into a host cell, preferably a host cell previously harvested from the animal to be treated, and
more preferably a somatic cell such as a muscle cell. In a subsequent step, the cell that has been
transformed with the vector coding for the desired GENSET polypeptide or the desired fragment
thereof is reintroduced into the animal body in order to deliver the recombinant protein within the
body either locally or systemically.



Secretion vectors 

Some of the GENSET cDNAs or genomic DNAs of the invention may also be used to
construct secretion vectors capable of directing the secretion of the proteins encoded by genes
inserted in the vectors. Such secretion vectors may facilitate the purification or enrichment of the
proteins encoded by genes inserted therein by reducing the number of background proteins from
which the desired protein must be purified or enriched. Exemplary secretion vectors are described
below.

The secretion vectors of the present invention include a promoter capable of directing gene
expression in the host cell, tissue, or organism of interest. Such promoters include the Rous
Sarcoma Virus promoter, the SV40 promoter, the human cytomegalovirus promoter, and other
promoters familiar to those skilled in the art.

A signal sequence from a polynucleotide of the invention, preferably a signal sequence
selected from the group of signal sequences of SEQ ID Nos: 1-31 and 33-143 and signal sequences
of clone inserts of the deposited clone pool, is operably linked to the promoter such that the mRNA
transcribed from the promoter will direct the translation of the signal peptide. The host cell, tissue,
or organism may be any cell, tissue, or organism which recognizes the signal peptide encoded by
the signal sequence in the GENSET cDNA or genomic DNA. Suitable hosts include mammalian
cells, tissues, or organisms, avian cells, tissues, or organisms, insect cells, tissues, or organisms, or
yeast.

In addition, the secretion vector contains cloning sites for inserting genes encoding the
proteins which are to be secreted. The cloning sites facilitate the cloning of the insert gene in frame
with the signal sequence such that a fusion protein in which the signal peptide is fused to the protein
encoded by the inserted gene is expressed from the mRNA transcribed from the promoter. The
signal peptide directs the extracellular secretion of the fusion protein.
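
The in-frame requirement described above can be illustrated with a minimal sketch. The function name `fuse_in_frame` and the toy sequences in the usage note are hypothetical and are not taken from the specification; the sketch only checks that joining a signal-sequence coding region to an insert preserves the reading frame, and it is not a description of any particular cloning procedure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(signal_cds: str, insert_cds: str) -> str:
    """Join a signal-sequence coding region to an insert coding region.

    Hypothetical helper: raises ValueError if the junction would shift the
    reading frame or if the signal sequence would terminate translation
    before the insert is reached.
    """
    signal_cds = signal_cds.upper()
    insert_cds = insert_cds.upper()

    if not signal_cds.startswith("ATG"):
        raise ValueError("signal sequence should begin with a start codon")
    # A signal sequence whose length is not a multiple of 3 would push the
    # insert into a different reading frame.
    if len(signal_cds) % 3 != 0:
        raise ValueError("signal sequence length is not a multiple of 3")
    if len(insert_cds) % 3 != 0:
        raise ValueError("insert length is not a multiple of 3")

    # An in-frame stop codon in the signal sequence would prevent
    # translation from continuing into the insert.
    signal_codons = [signal_cds[i:i + 3] for i in range(0, len(signal_cds), 3)]
    if any(codon in STOP_CODONS for codon in signal_codons):
        raise ValueError("signal sequence contains an in-frame stop codon")

    return signal_cds + insert_cds
```

For example, `fuse_in_frame("ATGAAAGCT", "GGGTTTTAA")` returns the concatenated coding region, whereas a signal sequence of eight nucleotides is rejected because the insert would be read out of frame.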

The secretion vector may be DNA or RNA and may integrate into the chromosome of the
host, be stably maintained as an extrachromosomal replicon in the host, be an artificial
chromosome, or be transiently present in the host. Preferably, the secretion vector is maintained in
multiple copies in each host cell. As used herein, multiple copies means at least 2, 5, 10, 20, 25, 50
or more than 50 copies per cell. In some embodiments, the multiple copies are maintained
extrachromosomally. In other embodiments, the multiple copies result from amplification of a
chromosomal sequence.

Many nucleic acid backbones suitable for use as secretion vectors are known to those
skilled in the art, including retroviral vectors, SV40 vectors, Bovine Papilloma Virus vectors, yeast
integrating plasmids, yeast episomal plasmids, yeast artificial chromosomes, human artificial
chromosomes, P element vectors, baculovirus vectors, or bacterial plasmids capable of being
transiently introduced into the host.

The secretion vector may also contain a polyA signal such that the polyA signal is located
downstream of the gene inserted into the secretion vector.

After the gene encoding the protein for which secretion is desired is inserted into the
secretion vector, the secretion vector is introduced into the host cell, tissue, or organism using
calcium phosphate precipitation, DEAE-Dextran, electroporation, liposome-mediated transfection,
viral particles or as naked DNA. The protein encoded by the inserted gene is then purified or
enriched from the supernatant using conventional techniques such as ammonium sulfate
precipitation, immunoprecipitation, immunochromatography, size exclusion chromatography, ion
exchange chromatography, and HPLC. Alternatively, the secreted protein may be in a sufficiently
enriched or pure state in the supernatant or growth media of the host to permit it to be used for its
intended purpose without further enrichment.

The signal sequences may also be inserted into vectors designed for gene therapy. In such
vectors, the signal sequence is operably linked to a promoter such that mRNA transcribed from the
promoter encodes the signal peptide. A cloning site is located downstream of the signal sequence
such that a gene encoding a protein whose secretion is desired may readily be inserted into the
vector and fused to the signal sequence. The vector is introduced into an appropriate host cell. The
protein expressed from the promoter is secreted extracellularly, thereby producing a therapeutic
effect.

Cell Hosts 

Another object of the invention comprises a host cell that has been transformed or
transfected with one of the polynucleotides described herein, and in particular a polynucleotide
either comprising a GENSET regulatory polynucleotide or the polynucleotide coding for a
GENSET polypeptide. Also included are host cells that are transformed (prokaryotic cells) or that
are transfected (eukaryotic cells) with a recombinant vector such as one of those described above.
However, the cell hosts of the present invention can comprise any of the polynucleotides of the
present invention. In a preferred embodiment, host cells contain a polynucleotide sequence
comprising a sequence selected from the group consisting of sequences of SEQ ID Nos: 1-241,
sequences of clone inserts of the deposited clone pool, variants and fragments thereof. Preferred
host cells used as recipients for the expression vectors of the invention are the following:

a) Prokaryotic host cells: Escherichia coli strains (i.e., DH5-α strain), Bacillus subtilis,
Salmonella typhimurium, and strains from species like Pseudomonas, Streptomyces and
Staphylococcus.

b) Eukaryotic host cells: HeLa cells (ATCC No. CCL2; No. CCL2.1; No. CCL2.2), CV-1 cells
(ATCC No. CCL70), COS cells (ATCC No. CRL1650; No. CRL1651), Sf-9 cells (ATCC
No. CRL1711), C127 cells (ATCC No. CRL-1804), 3T3 (ATCC No. CRL-6361), CHO (ATCC No.
CCL-61), human kidney 293 (ATCC No. 45504; No. CRL-1573) and BHK (ECACC No.
84100501; No. 84111301).

c) Other mammalian host cells.

The present invention also encompasses primary, secondary, and immortalized
homologously recombinant host cells of vertebrate origin, preferably mammalian origin and
particularly human origin, that have been engineered to: a) insert exogenous (heterologous)
polynucleotides into the endogenous chromosomal DNA of a targeted gene, b) delete endogenous
chromosomal DNA, and/or c) replace endogenous chromosomal DNA with exogenous
polynucleotides. Insertions, deletions, and/or replacements of polynucleotide sequences may be to
the coding sequences of the targeted gene and/or to regulatory regions, such as promoter and
enhancer sequences, operably associated with the targeted gene.

In addition to encompassing host cells containing the vector constructs discussed herein, the
invention also encompasses primary, secondary, and immortalized host cells of vertebrate origin,
particularly mammalian origin, that have been engineered to delete or replace endogenous genetic
material (e.g., coding sequence), and/or to include genetic material (e.g., heterologous
polynucleotide sequences) that is operably associated with the polynucleotides of the invention, and
which activates, alters, and/or amplifies endogenous polynucleotides. For example, techniques
known in the art may be used to operably associate heterologous control regions (e.g., promoter
and/or enhancer) and endogenous polynucleotide sequences via homologous recombination, see,
e.g., U.S. Patent No. 5,641,670, issued June 24, 1997; International Publication No. WO 96/29411,
published September 26, 1996; International Publication No. WO 94/12650, published August 4,
1994; Koller et al. (1989); and Zijlstra et al. (1989) (the disclosures of each of which are
incorporated by reference in their entireties).

The present invention further relates to a method of making a homologously recombinant
host cell in vitro or in vivo, wherein the expression of a targeted gene not normally expressed in the
cell is altered. Preferably the alteration causes expression of the targeted gene under normal growth
conditions or under conditions suitable for producing the polypeptide encoded by the targeted gene.
The method comprises the steps of: (a) transfecting the cell in vitro or in vivo with a polynucleotide
construct, said polynucleotide construct comprising: (i) a targeting sequence; (ii) a regulatory
sequence and/or a coding sequence; and (iii) an unpaired splice donor site, if necessary, thereby
producing a transfected cell; and (b) maintaining the transfected cell in vitro or in vivo under
conditions appropriate for homologous recombination.

The present invention further relates to a method of altering the expression of a targeted
gene in a cell in vitro or in vivo wherein the gene is not normally expressed in the cell, comprising
the steps of: (a) transfecting the cell in vitro or in vivo with a polynucleotide construct, said
polynucleotide construct comprising: (i) a targeting sequence; (ii) a regulatory sequence and/or a
coding sequence; and (iii) an unpaired splice donor site, if necessary, thereby producing a
transfected cell; (b) maintaining the transfected cell in vitro or in vivo under conditions
appropriate for homologous recombination, thereby producing a homologously recombinant cell;
and (c) maintaining the homologously recombinant cell in vitro or in vivo under conditions
appropriate for expression of the gene.

The present invention further relates to a method of making a polypeptide of the present
invention by altering the expression of a targeted endogenous gene in a cell in vitro or in vivo
wherein the gene is not normally expressed in the cell, comprising the steps of: (a) transfecting the
cell in vitro with a polynucleotide construct, said polynucleotide construct comprising: (i) a
targeting sequence; (ii) a regulatory sequence and/or a coding sequence; and (iii) an unpaired splice
donor site, if necessary, thereby producing a transfected cell; (b) maintaining the transfected cell in
vitro or in vivo under conditions appropriate for homologous recombination, thereby producing a
homologously recombinant cell; and (c) maintaining the homologously recombinant cell in vitro or
in vivo under conditions appropriate for expression of the gene, thereby making the polypeptide.

The present invention further relates to a polynucleotide construct which alters the
expression of a targeted gene in a cell type in which the gene is not normally expressed. This
occurs when the polynucleotide construct is inserted into the chromosomal DNA of the target cell,
wherein said polynucleotide construct comprises: a) a targeting sequence; b) a regulatory sequence
and/or coding sequence; and c) an unpaired splice-donor site, if necessary. Further included is a
polynucleotide construct, as described above, wherein said polynucleotide construct further
comprises a polynucleotide which encodes a polypeptide and is in-frame with the targeted
endogenous gene after homologous recombination with chromosomal DNA.

The compositions may be produced, and methods performed, by techniques known in the
art, such as those described in U.S. Patent Nos: 6,054,288; 6,048,729; 6,048,724; 6,048,524;
5,994,127; 5,968,502; 5,965,125; 5,869,239; 5,817,789; 5,783,385; 5,733,761; 5,641,670;
5,580,734; International Publication Nos: WO 96/29411, WO 94/12650; and scientific articles
described by Koller et al. (1994) (the disclosures of each of which are incorporated by reference
in their entireties).

GENSET gene expression in mammalian cells, preferably human cells, may be
rendered defective, or alternatively may be altered by replacing the endogenous GENSET gene in
the genome of an animal cell by a GENSET polynucleotide according to the invention. These
genetic alterations may be generated by homologous recombination using previously described
specific polynucleotide constructs.

Mammal zygotes, such as murine zygotes, may be used as cell hosts. For example, murine
zygotes may undergo microinjection with a purified DNA molecule of interest, for example a
purified DNA molecule that has previously been adjusted to a concentration ranging from 1 ng/ml
(for BAC inserts) to 3 ng/µl (for P1 bacteriophage inserts) in 10 mM Tris-HCl, pH 7.4, 250
EDTA containing 100 mM NaCl, 30 µM spermine, and 70 µM spermidine. When the DNA to be
microinjected has a large size, polyamines and high salt concentrations can be used in order to avoid
mechanical breakage of this DNA, as described by Schedl et al. (1993b), which disclosure is hereby
incorporated by reference in its entirety.

Any one of the polynucleotides of the invention, including the polynucleotide constructs
described herein, may be introduced in an embryonic stem (ES) cell line, preferably a mouse ES
cell line. ES cell lines are derived from pluripotent, uncommitted cells of the inner cell mass of
pre-implantation blastocysts. Preferred ES cell lines are the following: ES-E14TG2a (ATCC No.
CRL-1821), ES-D3 (ATCC No. CRL1934 and No. CRL-11632), YS001 (ATCC No. CRL-11776), 36.5
(ATCC No. CRL-11116). ES cells are maintained in an uncommitted state by culture in the
presence of growth-inhibited feeder cells which provide the appropriate signals to preserve this
embryonic phenotype and serve as a matrix for ES cell adherence. Preferred feeder cells are
primary embryonic fibroblasts that are established from tissue of day 13 to day 14 embryos of
virtually any mouse strain, that are maintained in culture, such as described by Abbondanzo et al.
(1993), and are growth-inhibited by irradiation, such as described by Robertson (1987), or by the
presence of an inhibitory concentration of LIF, such as described by Pease and Williams (1990),
which disclosures are hereby incorporated by reference in their entireties.

The constructs in the host cells can be used in a conventional manner to produce the gene 
product encoded by the recombinant sequence. 

Following transformation of a suitable host and growth of the host to an appropriate cell
density, the selected promoter is induced by appropriate means, such as temperature shift or
chemical induction, and cells are cultivated for an additional period. Cells are typically harvested
by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained
for further purification. Microbial cells employed in the expression of proteins can be disrupted by
any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of
cell lysing agents. Such methods are well known by the skilled artisan.

Transgenic Animals 

The terms "transgenic animals" or "host animals" are used herein to designate animals that
have their genome genetically and artificially manipulated so as to include one of the nucleic acids
according to the invention. Preferred animals are non-human mammals and include those
belonging to a genus selected from Mus (e.g. mice), Rattus (e.g. rats) and Oryctolagus (e.g. rabbits)
which have their genome artificially and genetically altered by the insertion of a nucleic acid
according to the invention. In one embodiment, the invention encompasses non-human host
mammals and animals comprising a recombinant vector of the invention or a GENSET gene
disrupted by homologous recombination with a knock out vector.

The transgenic animals of the invention all include within a plurality of their cells a cloned
recombinant or synthetic DNA sequence, more specifically one of the purified or isolated nucleic
acids comprising a GENSET coding sequence, a GENSET regulatory polynucleotide, a
polynucleotide construct, or a DNA sequence encoding an antisense polynucleotide such as
described in the present specification.

Generally, a transgenic animal according to the present invention comprises any of the
polynucleotides, the recombinant vectors and the cell hosts described in the present invention. In a
first preferred embodiment, these transgenic animals may be good experimental models in order to
study the diverse pathologies related to the dysregulation of the expression of a given GENSET
gene, in particular the transgenic animals containing within their genome one or several copies of an
inserted polynucleotide encoding a native GENSET protein, or alternatively a mutant GENSET
protein.

In a second preferred embodiment, these transgenic animals may express a desired
polypeptide of interest under the control of the regulatory polynucleotides of the GENSET gene,
leading to high yields in the synthesis of this protein of interest, and possibly to tissue-specific
expression of the protein of interest.

In a third preferred embodiment, these transgenic animals may express a desired 
polypeptide of interest fused to a GENSET signal peptide sequence, leading to the secretion of the 
fusion (chimeric) polypeptide. 

The design of the transgenic animals of the invention may be made according to the
conventional techniques well known to one skilled in the art. For more details regarding the
production of transgenic animals, and specifically transgenic mice, reference may be made to US Patents
Nos 4,873,191, issued Oct. 10, 1989; 5,464,764, issued Nov. 7, 1995; and 5,789,215, issued Aug. 4,
1998; these documents being herein incorporated by reference to disclose methods of producing
transgenic mice.

Transgenic animals of the present invention are produced by the application of procedures
which result in an animal with a genome that has incorporated exogenous genetic material. The
procedure involves obtaining the genetic material which encodes either a GENSET coding
sequence, a GENSET regulatory polynucleotide or a DNA sequence encoding a GENSET antisense
polynucleotide, or a portion thereof, such as described in the present specification. A recombinant
polynucleotide of the invention is inserted into an embryonic stem (ES) cell line. The insertion is
preferably made using electroporation, such as described by Thomas et al. (1987), which disclosure
is hereby incorporated by reference in its entirety. The cells subjected to electroporation are
screened (e.g. by selection via selectable markers, by PCR or by Southern blot analysis) to find
positive cells which have integrated the exogenous recombinant polynucleotide into their genome,
preferably via a homologous recombination event. An illustrative positive-negative selection
procedure that may be used according to the invention is described by Mansour et al. (1988), which
disclosure is hereby incorporated by reference in its entirety.

The positive cells are then isolated, cloned and injected into 3.5-day-old blastocysts from
mice, such as described by Bradley (1987), which disclosure is hereby incorporated by reference in
its entirety. The blastocysts are then inserted into a female host animal and allowed to grow to
term. Alternatively, the positive ES cells are brought into contact with 2.5-day-old embryos at the
8-16 cell stage (morulae) such as described by Wood et al. (1993), or by Nagy et al. (1993), which
disclosures are hereby incorporated by reference in their entireties, the ES cells being internalized to
colonize extensively the blastocyst including the cells which will give rise to the germ line.

The offspring of the female host are tested to determine which animals are transgenic, i.e.,
include the inserted exogenous DNA sequence, and which ones are wild type.

Thus, the present invention also concerns a transgenic animal containing a nucleic acid, a 
recombinant expression vector or a recombinant host cell according to the invention. 

Recombinant Cell Lines Derived From The Transgenic Animals Of The Invention: 

A further object of the invention comprises recombinant host cells obtained from a
transgenic animal described herein. In one embodiment the invention encompasses cells derived
from non-human host mammals and animals comprising a recombinant vector of the invention or a
GENSET gene disrupted by homologous recombination with a knock out vector.

Recombinant cell lines may be established in vitro from cells obtained from any tissue of a
transgenic animal according to the invention, for example by transfection of primary cell cultures
with vectors expressing onc-genes such as SV40 large T antigen, as described by Chou (1989), and
Shay et al. (1991), which disclosures are hereby incorporated by reference in their entireties.

Uses of polypeptides of the invention

Proteins containing multimerization domains 

The invention relates to compositions and methods using proteins of the invention
containing a multimerization domain such as a leucine zipper or a helix loop helix domain.

Proteins of the invention containing a leucine zipper domain, herein referred to as LZP,
such as the ones described in this section and those containing a leucine zipper domain as shown in
Table VI, or parts thereof, preferably fragments comprising a leucine zipper domain, or derivatives
thereof, may be used to mediate multimerization of proteins of interest.

The leucine zipper consists of a periodic repetition of leucine residues at every seventh
position, covering a distance spanning eight helical turns. The segments containing these periodic
arrays of leucine residues appear to exist in an alpha-helical conformation, and the leucine side chains
extending from one alpha-helix interact with those from a similar alpha helix of a second
polypeptide, facilitating dimerization. The structure formed by cooperation of these two regions
forms a coiled coil (O'Shea E.K., Rutkowski R., Kim P.S., Science 243:538-542, 1989).
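
The periodicity described above (a leucine at every seventh position) lends itself to a simple sequence scan. The following is a minimal sketch only: the function name `find_leucine_zippers`, the default of four consecutive heptad leucines, and the toy sequence are assumptions made for illustration and are not part of the specification. A match merely flags a candidate region; it does not establish that the region is helical or that it dimerizes.

```python
def find_leucine_zippers(protein: str, min_repeats: int = 4):
    """Return (first, last) 0-based positions of runs in which leucine (L)
    occurs at every seventh residue at least `min_repeats` times in a row.

    `min_repeats=4` is an assumed threshold chosen for illustration.
    """
    seq = protein.upper()
    hits = []
    for start, residue in enumerate(seq):
        if residue != "L":
            continue
        # Skip leucines that merely continue a run begun 7 residues upstream,
        # so that each run is reported once, from its first leucine.
        if start >= 7 and seq[start - 7] == "L":
            continue
        count = 0
        pos = start
        while pos < len(seq) and seq[pos] == "L":
            count += 1
            pos += 7
        if count >= min_repeats:
            hits.append((start, pos - 7))
    return hits


# Toy sequence (hypothetical): leucines at positions 2, 9, 16 and 23.
toy = "MK" + "LAAAAAA" * 4 + "GG"
candidates = find_leucine_zippers(toy)
```

In practice such a scan would be combined with secondary-structure considerations, since a heptad leucine repeat outside an alpha-helical segment is unlikely to form a coiled coil.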

Leucine zippers contribute to targeting of various proteins (e.g. glucose transporters, Asano
et al., J. Biol. Chem., 267, 19636-19641 (1992)) and permit dimerization of various cytoplasmic
hormone receptors and enzymes (Forman et al., Mol. Endocrinol., 3, 1610-1626 (1989)). Leucine
zippers are also a common feature of protein transcription factors, where they permit homo- or
heterodimerization resulting in tight binding to DNA strands (for reviews, see Abel et al., Nature
341, 24-25 (1989); Jones et al., Cell 61, 9-11 (1990); Lamb et al., Trends in Biochemical Sciences
16, 417-422 (1991)).

Leucine zippers have been shown to be useful tools in several areas of biotechnology,
especially in protein engineering, where their ability to mediate homo-dimerization or
hetero-dimerization has found several applications. For example, Bosslet et al. have described the use of a
pair of leucine zippers for in vitro diagnosis, in particular for the immunochemical detection and
determination of an analyte in a biological liquid (US patent 5,643,731). Tso et al. have used
leucine zippers for producing bispecific antibody heterodimers (US patent 5,932,448). Methods of
preparing soluble oligomeric proteins using leucine zippers have been described by Conrad et al.
(US patent 5,965,712), Ciardelli et al. (US patent 5,837,816), and Spriggs et al. (WO9410308). Leucine
zipper forming sequences have been used by Pelletier et al. in protein fragment complementation
assays to detect biomolecular interactions (WO9834120). Because of their usefulness in
biotechnology, it is thus highly interesting to isolate new leucine zipper domains.

The multimerization activity of proteins containing leucine zipper domains may be assayed
using any of the assays known to those skilled in the art, including circular dichroism spectrum and
thermal melting analyses as described in US patent 5,942,433. Alternatively, the leucine zipper
motif in LZP could be used by those skilled in the art as a "bait protein" in a well-established yeast
double hybridization system to identify its interacting protein partners in vivo from a cDNA library
derived from different tissues or cell types of a given organism. Alternatively, LZP or part thereof
could be used by those skilled in the art in mammalian cell transfection experiments. When fused to a
suitable peptide tag such as a [His]6 tag in a protein expression vector and introduced into cultured
cells, the expressed fusion protein can be immunoprecipitated with its potential interacting proteins
by using an anti-tag peptide antibody. This method could be chosen either to identify the associated
partner or to confirm the results obtained by other methods such as those just mentioned.

In a preferred embodiment, the invention relates to compositions and methods of using LZP
or part thereof for preparing soluble multimeric proteins, which consist of multimers of fusion
proteins containing a leucine zipper fused to a protein of interest, using any technique known to
those skilled in the art including those described in international patent WO9410308, which
disclosure is hereby incorporated by reference in its entirety. In another preferred embodiment,
LZP or derivative thereof is used to produce bispecific antibody heterodimers as described in US
patent 5,932,448, which disclosure is hereby incorporated by reference in its entirety. Briefly,
leucine zippers capable of forming heterodimers are respectively linked to epitope binding
components with different specificities. Bispecific antibodies are formed by pairwise association of
the leucine zippers, forming a heterodimer which links two distinct epitope binding components.
In still another preferred embodiment, LZP or part thereof or derivative thereof is used for detection
and determination of an analyte in a biological liquid as described in US patent 5,643,731, which
disclosure is hereby incorporated by reference in its entirety. Briefly, a first leucine zipper is
immobilized on a solid support and the second leucine zipper is coupled to a specific binding
partner for an analyte in a biological fluid. The two peptides are then brought into contact, thereby
immobilizing the binding partner on the solid phase. The biological sample is then contacted with
the immobilized binding partner and the amount of analyte in the sample bound to the binding
partner determined. In still another preferred embodiment, the LZP or part thereof may be used to
synthesize novel nucleic acid binding proteins which are able to multimerize with proteins of
interest, for example to inhibit and/or control cellular growth, using any genetic engineering
technique known to those skilled in the art including the ones described in US patent 5,942,433,
which disclosure is hereby incorporated by reference in its entirety.

In another embodiment, the invention relates to compositions and methods using the LZP or
part thereof or derivative thereof in protein fragment complementation assays to detect
biomolecular interactions in vivo and in vitro as described in international patent WO 98/34120,
which disclosure is hereby incorporated by reference in its entirety. Such assays may be used to
study the equilibrium and kinetic aspects of molecular interactions, including protein-protein,
protein-nucleic acid, protein-carbohydrate and protein-small molecule interactions, and to screen
cDNA libraries for unknown proteins that bind to a target protein, or libraries of small organic
molecules for biological activity.

Still another object of the present invention relates to the use of the LZP or part thereof for
identifying new leucine zipper domains using any technique for detecting protein-protein
interactions known to those skilled in the art. Among the traditional methods which may be
employed are co-immunoprecipitation, crosslinking and co-purification of cell lysates through
gradients or chromatographic columns. Once isolated as a protein interacting with the LZP, such
an intracellular protein can be identified (e.g. its amino acid sequence determined) and can, in turn,
be used, in conjunction with standard techniques, to identify other proteins with which it interacts.
The amino acid sequence thus obtained may be used as a guide for the generation of oligonucleotide
mixtures that can be used to screen for gene sequences encoding such intracellular proteins.
Screening may be accomplished, for example, by standard hybridization or PCR techniques.
Techniques for the generation of oligonucleotide mixtures and the screening are well known. (See,
e.g., Ausubel et al., eds., Current Protocols in Molecular Biology, J. Wiley and Sons (New York,
NY 1993) and PCR Protocols: A Guide to Methods and Applications, 1990, Innis, M. et al., eds.,
Academic Press, Inc., New York).
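
The use of a determined amino acid sequence as a guide for generating oligonucleotide mixtures can
be illustrated with a minimal sketch. It assumes the standard genetic code; the function names and
the example peptides are hypothetical and are not taken from the present disclosure.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def degeneracy(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    count = 1
    for aa in peptide.upper():
        count *= len(CODONS_FOR[aa])
    return count


def oligonucleotide_mixture(peptide):
    """Every DNA sequence encoding the peptide, i.e. the full oligonucleotide mixture."""
    codon_choices = [CODONS_FOR[aa] for aa in peptide.upper()]
    return sorted("".join(codons) for codons in product(*codon_choices))


def least_degenerate_window(sequence, length):
    """Return (1-based start, peptide, degeneracy) of the window needing the smallest mixture."""
    windows = [
        (degeneracy(sequence[i:i + length]), i + 1, sequence[i:i + length])
        for i in range(len(sequence) - length + 1)
    ]
    best = min(windows)
    return best[1], best[2], best[0]
```

In practice a stretch of sequence rich in methionine and tryptophan (one codon each) would be
preferred, since it keeps the mixture small; least_degenerate_window is one simple way to locate
such a stretch.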

Alternatively, methods may be employed which result in the simultaneous identification of
genes which encode the intracellular proteins that can dimerize with the LZP or part thereof, using
any technique known to those skilled in the art. These methods include, for example, probing cDNA
expression libraries, in a manner similar to the well-known technique of antibody probing of
lambda gt11 libraries, using as a probe a labeled version of the LZP or part thereof, or a fusion
protein, e.g., the LZP or part thereof fused to a marker (e.g., an enzyme, fluor, luminescent protein,
or dye) or to an Ig-Fc domain (for technical details on screening of cDNA expression libraries, see
Ausubel et al., supra). Alternatively, another method for the detection of protein interaction in vivo,
the two-hybrid system, may be used.

Protein of SEQ ID NO: 261 (internal designation 116-054-3-0-E6-CS)

The 233-amino-acid protein of SEQ ID NO: 261 encoded by the cDNA of SEQ ID NO: 20
displays two leucine zipper sites at positions 142-163 and 170-191.

It is believed that the protein of SEQ ID NO: 261 is able to dimerize either with itself
(homo-dimerisation) or with a heterologous protein (hetero-dimerisation) of interest, through the
mediation of its leucine zipper domains. Preferred polypeptides of the invention are polypeptides
comprising fragments of SEQ ID NO: 261 from positions 142 to 163 and 170 to 191, and fragments
having any of the biological activities described herein.
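
How such leucine zipper sites can be located in a sequence is sketched below. The sketch assumes that
the "leucine zipper pattern" referred to here is the commonly used heptad-repeat motif
L-x(6)-L-x(6)-L-x(6)-L, which spans 22 residues and therefore matches the length of the positions
quoted above (e.g. 142-163). The function name and the test sequence are hypothetical; the actual
sequence of SEQ ID NO: 261 is not reproduced.

```python
import re

# Assumed motif: four leucines, each separated by six arbitrary residues (22 residues in total).
# The lookahead lets overlapping matches be reported as well.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(sequence):
    """Return 1-based (start, end) positions of every leucine zipper pattern match."""
    return [
        (match.start() + 1, match.start() + 22)
        for match in LEUCINE_ZIPPER.finditer(sequence.upper())
    ]


# Hypothetical toy sequence: two leading residues, one 22-residue pattern, two trailing residues.
toy_sequence = "MA" + "LAAAAAA" * 3 + "L" + "GG"
print(find_leucine_zippers(toy_sequence))
```

A match to this pattern only indicates a candidate site; whether the region actually forms a
dimerizing coiled coil has to be established experimentally, for example by the assays described
above.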

Protein of SEQ ID NO: 263 (internal designation 116-055-2-0-F7-CS)

The protein of SEQ ID NO: 263 encoded by the cDNA of SEQ ID NO: 22 displays a
leucine zipper pattern situated near its NH2 terminal part (position 15 to 36).

It is believed that the protein of SEQ ID NO: 263 is able to dimerize either with itself
(homo-dimerisation) or with a heterologous protein (hetero-dimerisation) of interest, through the
mediation of its leucine zipper domain. Preferred polypeptides of the invention are polypeptides
comprising fragments of SEQ ID NO: 263 from position 15 to 36, and fragments having any of the
biological activities described herein.

Protein of SEQ ID NO: 245 (internal designation 105-026-1-0-A5-CS)

The protein of SEQ ID NO: 245 encoded by the cDNA of SEQ ID NO: 4 displays a leucine
zipper pattern situated near its COOH terminal part (position 371 to 392).

It is believed that the protein of SEQ ID NO: 245 is able to dimerize either with itself
(homo-dimerisation) or with a heterologous protein (hetero-dimerisation) of interest, through the
mediation of its leucine zipper domain. Preferred polypeptides of the invention are polypeptides
comprising fragments of SEQ ID NO: 245 from position 371 to 392, and fragments having any of
the biological activities described herein.

Protein of SEQ ID NO: 257 (internal designation 106-043-4-0-H3-CS)

The 265-amino-acid-long protein of SEQ ID NO: 257 encoded by the cDNA of SEQ ID NO: 16
exhibits homology to a Homo sapiens hypothetical protein (Genbank accession number AJ278482).
These two proteins probably result from alternative splicing.

The protein of SEQ ID NO: 257 displays a leucine zipper pattern situated from position 155
to 176. Thus, it is believed that the protein of SEQ ID NO: 257 is able to dimerize either with itself
(homo-dimerisation) or with a heterologous protein (hetero-dimerisation) of interest, through the
mediation of its leucine zipper domain. Preferred polypeptides of the invention are polypeptides
comprising fragments containing the leucine zipper domain and fragments having any of the
biological activities described herein.

Protein of SEQ ID NO: 314 (internal designation 188-41-1-0-B8-CS.cor)

A growing number of proteins have been shown to undergo post-translational modification
by fatty acids that are covalently linked to cysteine residues through a thioester bond. Fatty acid
modifications contribute to intracellular protein localization by facilitating membrane binding and
also by strengthening protein-protein interactions. Cycles of palmitoylation and depalmitoylation
have been described for a number of intracellular proteins, but the relevant enzymes that catalyze
these processes have yet to be fully characterized and the full significance of these cycles remains to
be elucidated.

Palmitoyl-protein thioesterase-1 (PPT1) is a lysosomal hydrolase that removes long-chain
fatty acyl groups from modified cysteine residues in proteins. Mutations in PPT1 have been found
to cause the infantile form of neuronal ceroid lipofuscinosis (INCL).

Soyombo and Hofmann (J. Biol. Chem. 272: 27456-27463 [1997]) identified cDNAs
encoding PPT2. The deduced PPT2 protein contains 302 amino acids, including a 27-amino acid
leader peptide, a sequence motif characteristic of many thioesterases and lipases, and 5 potential
N-linked glycosylation sites. PPT2 shares 18% amino acid identity with PPT1. Soyombo and
Hofmann tentatively localized the human PPT2 gene to 6p21.3. Northern blot analysis detected a
predominant 2.0-kb PPT2 transcript in the human tissues examined, with the highest expression in
skeletal muscle; variable amounts of 2.8- and 7.0-kb transcripts were also observed.

Cell fractionation studies indicate that PPT2 is present in the lysosomal fraction.
Immunoblot analysis of recombinant PPT2 expressed in mammalian cells showed 6 PPT2 proteins
ranging in size from 31 to 42 kDa. Treatment that removes asparagine-linked oligosaccharides
resulted in a single major protein of 31 kDa and a minor protein of 33 kDa.

Recombinant PPT2, like PPT1, possesses thioesterase activity and localizes to the
lysosome. Since PPT2 could not substitute for PPT1 in correcting the metabolic defect in INCL
cells and was unable to remove palmitate groups from palmitoylated proteins, it appears that PPT2
possesses a different substrate specificity than PPT1. Another study, however, was able to show,
after expression of the recombinant protein in a baculovirus system and using cell lysate as
substrate, that the protein had S-thioesterase activity with a preference for the acyl groups of
palmitic and myristic acid.

The subject invention provides the protein/polypeptide of SEQ ID NO:314, encoded by the
cDNA of SEQ ID NO:73. The invention also provides biologically active fragments of SEQ ID
NO:314. In one embodiment, the polypeptides of SEQ ID NO:314 are interchanged with the
corresponding polypeptide encoded by the human cDNA of clone 188-41-1-0-B8-CS.
"Biologically active fragments" are defined as those peptide or polypeptide fragments having at
least one of the biological functions of the full length protein (e.g., removal of long-chain fatty acyl
groups from modified cysteine residues in proteins). Compositions of the protein/polypeptide of
SEQ ID NO:314, or biologically active fragments thereof, are also provided by the subject
invention. These compositions may be made according to methods well known in the art.

The invention also provides variants of the protein of SEQ ID NO:314. These variants have
at least about 80%, more preferably at least about 90%, and most preferably at least about 95%
amino acid sequence identity to the amino acid sequence encoded by SEQ ID NO:73. Variants
according to the subject invention also have at least one functional or structural characteristic of the
protein of SEQ ID NO:314. The invention also provides biologically active fragments of the
variant proteins. Compositions of variants, or biologically active fragments thereof, are also
provided by the subject invention. These compositions may be made according to methods well
known in the art. Unless otherwise indicated, the methods disclosed herein can be practiced
utilizing the protein encoded by SEQ ID NO:73, biologically active fragments of SEQ ID NO:314,
variants of SEQ ID NO:314, and biologically active fragments of the variants.
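
The identity thresholds recited for variants can be illustrated with a minimal sketch. It assumes
that the two sequences have already been aligned (equal length, gaps written as "-") and that
identity is computed over all aligned columns; other conventions for the denominator exist and give
slightly different values. The function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def identity_tier(identity):
    """Map a percent identity onto the thresholds recited in the text (80%, 90%, 95%)."""
    for threshold in (95, 90, 80):
        if identity >= threshold:
            return threshold
    return None


# Hypothetical 10-residue example with a single substitution at the last position.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

The thresholds in the text are qualified by "about", so a borderline value would in practice be
judged together with the functional or structural characteristic that the variant must retain.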

Because of the redundancy of the genetic code, a variety of different DNA sequences can
encode the amino acid sequence of SEQ ID NO:314. In a preferred embodiment, SEQ ID NO:314
is encoded by clone 188-41-1-0-B8-CS or the cDNA of SEQ ID NO:73. It is well within the skill
of a person trained in the art to create these alternative DNA sequences which encode proteins
having the same, or essentially the same, amino acid sequence. These variant DNA sequences are,
thus, within the scope of the subject invention. As used herein, reference to "essentially the same"
sequence refers to sequences that have amino acid substitutions, deletions, additions, or insertions
that do not materially affect biological activity. Fragments retaining one or more characteristic
biological activities of the protein encoded by clone 188-41-1-0-B8-CS are also included in this
definition.

In one aspect of the subject invention, SEQ ID NO:314, and variants thereof, can be used to
generate polyclonal or monoclonal antibodies. Both biologically active and immunogenic
fragments of SEQ ID NO:314, or variant proteins, can be used to produce antibodies. Polyclonal
and/or monoclonal antibodies can be made according to methods well known to the skilled artisan.
Antibodies produced in accordance with the subject invention can be used in a variety of detection
assays known to those skilled in the art.

SEQ ID NO:314 can be used as a marker for identification of lysosome dysfunction in
individuals. In this aspect of the subject invention, antibodies specific for SEQ ID NO:314, or
fragments thereof, are used in routine immunoassays to screen for the presence or absence of SEQ
ID NO:314, or fragments thereof, in samples containing lysosomal contents. The presence or
absence of the protein of SEQ ID NO:314 can be used to provide an indication of lysosomal
function and is, thus, useful for diagnostic/prognostic identification of lysosomal dysfunction.

The subject invention also provides materials and methods for the screening of individual
samples for the presence or absence of nucleic acids encoding the protein of SEQ ID NO:314, or
variants thereof. In one embodiment, nucleic acids are provided for hybridization assays, known to
those skilled in the art, of mRNA or cDNA. The hybridization assays are performed upon nucleic
acid samples obtained, or derived, from an individual with suspected lysosomal dysfunction. The
hybridization assays screen for the presence or absence of nucleic acids encoding SEQ ID NO:314,
or variants thereof. The presence or absence of such nucleic acids can be used as a
predictive/prognostic indicator of disease state or lysosome function.

Nucleic acids of the invention can also be used in gene replacement or gene therapy
protocols. In this aspect of the subject invention, nucleic acids encoding SEQ ID NO:314, or
biologically active fragments thereof, can be introduced into cells and implanted into an individual
with lysosomal disorders. In one embodiment, genetically engineered macrophages can be used for
the treatment regimen (see, for example, Eto and Ohashi [2000] J. Inherit. Metabol. Dis. 23:293-298).
Alternatively, autologous cells may be obtained from an individual, transformed with nucleic
acid ex vivo, expanded ex vivo, and reintroduced into the individual. Such methods are well known
to the skilled artisan.

Protein of SEQ ID NO: 280 (internal designation 160-75-4-0-A9-CS):

The protein of SEQ ID NO:280, encoded by the cDNA of SEQ ID NO:39 and expressed in
the fetal brain, is a chromosome 12 paralog of C7orf2, a human protein described as a
transmembrane receptor located on chromosome 7 (Heus, H. C., A. Hing, et al. (1999) Genomics
57(3): 342-51). In addition, this protein is an ortholog of the murine gene LMBR1L, found to be
involved in polydactyly in mice (Clark, R. M., P. C. Marker, et al. (2000) Genomics 67(1): 19-27).
A high level of homology was also found with a gene identified in Fugu rubripes (AF056116), as
well as with C. elegans R05D3.2 (Gellner, K. and S. Brenner (1999) Genome Res 9(3): 251-8).

The 362-amino-acid-long protein of SEQ ID NO:280, encoded by the cDNA of SEQ ID
NO:39, is a splice variant of Z64989, located on chromosome 12. The chromosome 12 gene has 6
known variants described in entries AK001356 and AK001651 in genbank and entries A26354,
A26375, X27360 and Z64989 in geneseqn. The closest sequence is Z64989, at both the nucleotide
and the protein level. Z64989 is split into 17 exons, of which the protein of the invention contains the
last 14. The transcription start site of the cDNA of SEQ ID NO:39 lies within the third intron of
Z64989, and the protein of the invention starts at position 128 of Z64989. In addition, 2 potential
leucine zippers are present in the protein of the invention (positions 136-157 and 272-293).

Preaxial polydactyly is a congenital hand malformation that includes duplicated thumbs,
various forms of triphalangeal thumbs, and duplications of the index finger. Clark et al. (supra)
demonstrated the correspondence between the spatial and temporal changes in Lmbr1 expression
and the embryonic onset of the polydactyly mutant phenotype, suggesting that a downregulation of
Lmbr1 results in polydactyly. It is likely that the Lmbr1 gene is involved in the patterning of limbs
during mammalian development, for example by receiving and transducing a locally secreted ligand
in the developing limb.

It is believed that the protein of SEQ ID NO:280 is a paralog of human C7orf2, and is thus
a membrane-bound protein implicated in the patterning of the mammalian body plan during early
development. For example, the protein of the invention may be involved in organizing limb
development, as well as in the development of the fetal brain. As such, the activity of the present
protein likely influences various cellular processes, including gene expression, cellular growth and
proliferation, as well as cellular differentiation. In addition, leucine zippers within the present
protein render the protein capable of undergoing specific protein-protein interactions with other
leucine-zipper containing proteins, including with itself (i.e. homodimerization). Preferred
polypeptides of the invention are fragments of SEQ ID NO:280 having any of the biological
activities described herein.

In one embodiment of the present invention, the present protein can be used to identify cells
of the fetal brain. For example, the protein of the invention or part thereof may be used to
synthesize specific antibodies using any technique known to those skilled in the art. Such
tissue-specific antibodies may then be used to identify tissues of unknown origin, such as in forensic
samples, differentiated tumor tissue that has metastasized to foreign bodily sites, etc., or to
differentiate different tissue types in a tissue cross-section using immunochemistry. In addition,
labeled reagents that can specifically bind to the protein of the invention can be used to visualize
cell membranes and the components of the secretory pathway in cells, e.g. the ER and Golgi.

In another embodiment of the present invention, the present protein can be used to
diagnose developmental abnormalities, or the potential for such abnormalities, e.g. in a fetus, or in
adults (i.e. to determine whether they are carriers of a mutant copy of the gene). Individuals
found to carry one or two mutant copies of the present gene would be candidates for, e.g., gene
therapy or other strategies to correct or compensate for the gene deficiency, or for strategies to
ensure that their children would not be carriers of the mutated gene. The characterization of
mutations in genes encoding the present protein would also be of great value in understanding the
nature of polydactyly and other developmental disorders, thereby facilitating the development of
other strategies for treating and preventing these disorders.

In another embodiment, the present protein is used to modulate gene expression, cell growth
and proliferation, and/or cell differentiation in cells in vitro or in vivo. For example, any of these
behaviors can be increased or inhibited in cells grown in vitro, e.g. for protein production or for ex
vivo therapeutic strategies. In addition, any disease associated with an increase or decrease in any
of these cellular behaviors in vivo can be treated or prevented by enhancing or inhibiting the
expression or activity of the protein of the invention in cells in vivo.

Proteins of SEQ ID NOs: 309 and 304 (internal designations 188-11-1-0-B3-CS and
187-34-0-0-112-CS)

The proteins of SEQ ID NOs: 309 and 304 are encoded by the cDNAs of SEQ ID NOs: 68
and 63. Accordingly, it will be appreciated that all characteristics and uses of the polypeptides of
SEQ ID NOs: 309 and 304 described throughout the present application also pertain to the
polypeptides encoded by the human cDNAs of clones 188-11-1-0-B3-CS and 187-34-0-0-112-CS. In
addition, it will be appreciated that all characteristics and uses of the nucleic acids of SEQ ID NOs:
68 and 63 described throughout the present application also pertain to the nucleic acids of the
human cDNAs of clones 188-11-1-0-B3-CS and 187-34-0-0-112-CS.

The protein of SEQ ID NO: 309 (encoded by the clone having internal designation number
188-11-1-0-B3-CS) and the polymorphic variant thereof of SEQ ID NO: 304 (encoded by the clone
having internal identification number 187-34-0-0-112-CS and which differs from the polypeptide
encoded by the clone having internal designation number 188-11-1-0-B3-CS at a single amino acid)
are highly homologous to the first 279 amino acids of the LGI1 (Leucine-rich gene, Glioma
Inactivated) protein. Clones 188-11-1-0-B3-CS and 187-34-0-0-112-CS appear to be splicing and
polymorphic variants of LGI1. The LGI1 protein is 557 amino acids in length (see Somerville et
al. (2000) Mammalian Genome 11, 622-627; Chernova et al. (1998) Oncogene 17, 2873-2881,
the disclosures of which are incorporated herein by reference in their entireties). Clone
188-11-1-0-B3-CS aligns with the first 279 amino acids of LGI1, followed by the addition of 12
amino acids (VLREMRFTNMS) at the C-terminal end which do not appear to be homologous to LGI1. Like
LGI1, clone 188-11-1-0-B3-CS and the polymorphic variant 187-34-0-0-112-CS contain the LRR
domain and are highly expressed in brain tissue.

LGI1 belongs to a large family of leucine-rich repeat (LRR) proteins. It is believed that the
LRR domains act as a region of protein-protein interaction. This has been substantiated as the
family of known LRR proteins has grown. Leucine-rich repeats have been identified as essential
components in glycoprotein hormone receptors, proteoglycans and the Trk proteins by expression of
mutants and artificial chimaeras in tissue culture and by biochemical analysis of the properties of
these constructs. Many transmembrane LRR proteins are known or suspected to have truncated
forms (N and L 6, and slit, for example) with functional significance. The proteoglycan decorin, a
secreted protein, binds TGF-β, a growth factor which stimulates decorin expression. Since decorin
inhibits growth of cultured cells, it may form part of a negative feedback loop to regulate cell
growth. This is similar to the proposed function of the LGI1 receptor protein.

Analysis of brain gliomas has revealed that LGI1 expression is either abolished or greatly
reduced in high-grade tumors compared with more benign ones, indicating a role as a tumor
suppressor gene (Cowell et al. 2000; Cowell et al. 1998, the disclosures of which are incorporated
herein by reference in their entireties). Most glioblastoma multiforme (GBM) brain tumors contain
only one genomic copy of LGI1, and this one is almost invariably not expressed. How the gene is
inactivated is not clear, although one possibility is that chromosome or gene rearrangements, which
occur in 20-25% of tumors, cause inactivation as a result of a positional effect. Recently it was
determined that the LGI1 gene is located on 10q24, is disrupted by translocation in the T98G
GBM cell line, and is also rearranged in over 26% of primary brain tumors. Alternatively, LGI1
may be part of a highly regulated pathway where inactivation of other key members or highly specific
transcription factors results in either inactivation of all genes in the pathway or a failure to initiate
transcription.

Since functional inactivation of LGI1 occurs during the transition of low-grade to high-grade
brain tumors, knockout or transgenic mice in which the expression of the protein of SEQ ID
NO:309 or 304 has been reduced, eliminated or altered may be used as a disease model. In particular,
mice that overexpress LGI1 may be used as a tumorigenesis model.

Mice are particularly useful as models for assessing the consequences of altering the level
or activity of the proteins of SEQ ID NO:309 or 304 or for identifying agents useful in treating
tumorigenesis, since human and mouse LGI1 are highly conserved, showing 91% identity at the
nucleotide level and 97% similarity at the amino acid level, with most of the amino acid
substitutions being conservative. The mouse lgi1 gene is 4.2 kb in length, while the human LGI1 is
2.2 kb in length. This difference in size between the human and mouse gene is a result of the
inclusion of a 2 kb sequence in the 3' untranslated region in the mouse gene. Whether the
additional sequence affects gene expression is not clear. Further analysis of the genomic sequence
reveals that the number of exon/intron boundaries is also similar in humans and mice. The high
degree of LGI1 conservation between mice and humans implies that this gene has experienced a
strong selection pressure. It is intriguing to speculate that any major deviations in the primary
protein sequence may result in a loss of function of this gene product. Total or partial loss of the
LGI1 gene function could, therefore, be lethal, which in turn implies that LGI1 plays an important
role in normal brain development as well as in tumor formation.

SEQ ID NOs:309 and 304 also have high homology with Slit, a secreted Drosophila protein
which plays a role in axon pathway development in the central nervous system.
The Slit protein is necessary for the normal development of the midline of the CNS, particularly the
midline glial cells, and for the concomitant formation of the commissural axon pathway. The
process is dependent on the level of Slit protein expression. It appears that the Slit protein is
excreted by the midline glial cells, where it is synthesized, and is eventually associated with the
surface axons that traverse them. Contact of cells with supernatant expressing the product of this
gene increases the permeability of THP-1 monocyte cells to calcium. Thus, it is likely that Slit is
involved in a signal transduction pathway that is initiated when Slit protein binds a receptor on the
surface of the monocyte cell.

In view of the above, it is believed that the proteins of SEQ ID NOs:309 and 304 are
involved in a signal transduction pathway mediated through a receptor that modulates the
differentiation and/or proliferation of cells.

Northern blot analysis detects LGI1 transcripts only in brain, neural tissue, and skeletal
muscle but not in heart, kidney, lung, placenta, liver, or pancreas. Northern blot analysis of RNA
derived from several different regions of human brain revealed a widespread expression of LGI1,
although with different intensities. The highest abundance was found in cerebral cortex,
hippocampus, and putamen. The lowest expression was detected in corpus callosum. The levels of
expression were intermediate in the other brain regions. Accordingly, the proteins of SEQ ID
NOs:309 or 304 or fragments thereof, as well as polynucleotides encoding the proteins of SEQ ID
NOs:309 or 304, may be used to determine whether a tissue sample is derived from brain (and in
particular cerebral cortex, hippocampus, or putamen), neural tissue, or skeletal muscle, or to
distinguish whether a tissue sample is derived from brain or another tissue, such as heart, kidney,
lung, placenta, liver, or pancreas.

Accordingly, the present invention includes the use of the protein of SEQ ID NOs: 309 or
304, fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition, such as those listed above, in an individual. In such embodiments, the
protein of SEQ ID NO:309 or 304, or a fragment thereof, is administered to an individual in whom
it is desired to increase or decrease any of the activities of the protein of SEQ ID NO:309 or 304,
including tumor suppression, modulation of neural development or involvement in brain tumors,
glioblastoma multiforme, brain injuries, neurodegenerative disease states and behavioral disorders
such as Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, Huntington's disease,
schizophrenia, obsessive compulsive disorders, and in the processes of nerve regeneration in spinal
cord injury, stroke, facial nerve damage, diabetes-caused nerve damage, and retinal regeneration.
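
What "fragments comprising at least N consecutive amino acids" means in practice can be sketched as
a sliding window over the sequence. The function names and the toy sequence below are hypothetical;
only the list of lengths is taken from the text above.

```python
# Minimum fragment lengths recited in the text.
FRAGMENT_LENGTHS = (5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, 200)


def consecutive_fragments(sequence, length):
    """Yield (1-based start, fragment) for each window of `length` consecutive residues."""
    for start in range(len(sequence) - length + 1):
        yield start + 1, sequence[start:start + length]


def fragment_counts(sequence, lengths=FRAGMENT_LENGTHS):
    """Number of distinct window positions available for each fragment length."""
    return {n: max(len(sequence) - n + 1, 0) for n in lengths}


# Hypothetical 7-residue toy sequence: three windows of 5 consecutive residues.
print(list(consecutive_fragments("ABCDEFG", 5)))
```

A sequence of length L contains L - N + 1 windows of exactly N consecutive residues, so shorter
minimum lengths define a larger set of candidate fragments.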

The protein of SEQ ID NO:309 or 304 or a fragment thereof may be administered directly
to the individual or, alternatively, a nucleic acid encoding the protein of SEQ ID NO:309 or 304 or a
fragment thereof may be administered to the individual. Alternatively, an agent which increases
the activity of the protein of SEQ ID NO:309 or 304 may be administered to the individual. Such
agents may be identified by contacting the protein of SEQ ID NO:309 or 304 or a cell or preparation
containing the protein of SEQ ID NO:309 or 304 with a test agent and assaying whether the test
agent increases the activity of the protein. For example, the test agent may be a chemical compound
or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:309 or 304 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:309 or 304 may be identified by contacting the
protein or a cell or preparation containing the protein with a test agent and assaying whether the
test agent decreases the activity of the protein. For example, the agent may be a chemical compound, a
polypeptide or peptide, an antibody, or a nucleic acid such as an antisense nucleic acid or a triple
helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of 
the invention or part thereof as a marker protein to selectively identify tissues, preferably brain, or 
to distinguish between two or more possible sources of a tissue sample on the basis of the level of 
the protein of SEQ ID NO:309 or 304 in the sample. For example, the protein of SEQ ID NO:309 

15 or 304 or fragments thereof may be used to generate antibodies using any techniques known to 
those skilled in the art, including those described therein. Such tissue-specific antibodies may then 
be used to identify tissues of unknown origin, for example, forensic samples,differentiated tumor 
tissue that has metastasized to foreign bodily sites, or to differentiate different tissue types in a 
tissue cross-section using immunochemistry. In such methods a tissue sample is contacted with the 

20 antibody, which may be detectably labeled, under conditions which facilitate antibody binding. The 
level of antibody binding to the test sample is measured and compared to the level of binding to 
control cells from brain or tissues other than brain to determine whether the test sample is from 
brain. Alternatively, the level of the protein of SEQ ID NO:309 or 304 in a test sample may be 
measured by determining the level of RNA encoding the protein of SEQ ID NO:309 or 304 in the 
test sample. RNA levels may be measured using nucleic acid arrays or using techniques such as in 
situ hybridization, Northern blots, dot blots or other techniques familiar to those skilled in the art. If 
desired, an amplification reaction, such as a PCR reaction, may be performed on the nucleic acid 
sample prior to analysis. The level of RNA in the test sample is compared to RNA levels in control 
cells from brain or tissues other than brain to determine whether the test sample is from brain. For a 
number of disorders listed above, particularly of the nervous system, expression of the genes 
encoding the polypeptide of SEQ ID NO:309 or 304 at significantly higher or lower levels may be 
routinely detected in certain tissues or cell types (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, synovial fluid, and spinal fluid) or another tissue or cell sample taken 
from an individual having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the disorder. 

In another embodiment, antibodies to the protein of SEQ ID NO:309 or 304 or part thereof 
may be used for detection, enrichment, or purification of cells expressing the protein of SEQ ID 
NO:309 or 304, including using methods known to those skilled in the art. For example, an 
antibody against the protein of SEQ ID NO:309 or 304 or a fragment thereof may be fixed to a solid 
support, such as a chromatography matrix. A preparation containing cells expressing the protein of 
SEQ ID NO:309 or 304 is placed in contact with the antibody under conditions which facilitate 
binding to the antibody. The support is washed and then the cells are released from the support by 
contacting the support with agents which cause the cells to dissociate from the antibody. 

In another embodiment of the present invention, the protein of SEQ ID NO:309 or 304 or a 
fragment thereof may be used to diagnose disorders associated with altered expression of the 
protein of SEQ ID NO:309 or 304. In some embodiments, the protein of SEQ ID NO:309 or 304 or 
fragments thereof may be used to diagnose cancer. In such techniques, the level of the protein of 
SEQ ID NO:309 or 304 in an ill individual is measured using techniques such as those described 
herein and compared to the level in normal individuals. For example, a decreased level of the 
protein of SEQ ID NO:309 or 304 relative to normal individuals suggests that the ill individual may 
suffer from cancer or be predisposed to getting cancer in the future. 

Another embodiment of the present invention is a polypeptide comprising a structural or 
functional domain of the protein of SEQ ID NO:309 or 304. Such structural or functional domains 
of the protein of SEQ ID NO:309 or 304 include a leucine rich repeat C-terminal domain located 
between amino acid positions 173 and 222, a leucine rich repeat located between amino acid 
positions 92 and 115, a leucine rich repeat located between amino acid positions 116 and 139, a 
leucine rich repeat located between amino acid positions 140 and 163, a leucine rich repeat located 
between amino acid positions 164 and 185, a membrane spanning segment located between amino 
acid positions 15 and 35, and a signal peptide comprising the sequence FLCLLSALLLTEG/KK. 

Accordingly, the protein of SEQ ID NO:309 or 304 or fragments thereof, or polynucleotides 
encoding these proteins or fragments, may be used in in vitro diagnostic assays for malignant brain 
tumors, such as glioblastoma multiforme. These proteins or nucleic acids may also be used in the 
attenuation, prevention and/or treatment of brain tumors and/or brain injuries, of neurodegenerative 
disease states and behavioral disorders such as Alzheimer's disease, Parkinson's disease, epilepsy, 
multiple sclerosis, Huntington's disease, schizophrenia, obsessive compulsive disorders, and in the 
processes of nerve regeneration in spinal cord injury, stroke, facial nerve damage, diabetes-induced 
nerve damage, and retinal regeneration. 

In addition, the protein, as well as antibodies directed against the protein and relevant 
small molecules, may be used as tumor markers and/or immunotherapy targets for the above disease 
states. For example, antibodies directed against amino acids VLREWRFTNMS of both clones may 
aid in the differential detection of the secreted and receptor forms of this protein, since the proteins 
of SEQ ID NOs:309 and 304 have homology to the secreted forms of LGI1. In addition, the 
proteins of SEQ ID NOs:309 and 304 or fragments thereof may be used to identify binding partners 
as described herein. 

DNA-binding proteins 

The invention relates to compositions and methods using proteins of the invention containing a 
DNA-binding domain, herein referred to as DBP, such as the ones described in this section and those 
containing a DNA binding domain as shown in Table VI, or parts thereof, preferably 
fragments comprising a DNA binding domain, or derivatives thereof. 

Transcriptional regulation is primarily achieved by the sequence-specific binding of proteins to 
DNA and RNA. Of the known protein motifs involved in the sequence specific recognition of DNA, 
the zinc finger protein is unique in its modular nature. Zinc finger domains are found in numerous zinc 
binding proteins which are involved in protein-nucleic acid interactions. They are independently folded 
zinc-containing mini-domains which are used in a modular repeating fashion to achieve sequence- 
specific recognition of DNA (Klug 1993 Gene 135, 83-92). Such zinc binding proteins are commonly 
involved in the regulation of gene expression, and usually serve as transcription factors (see US patents 
5,866,325; 6,013,453 and 5,861,495). 

To date, zinc finger proteins have been identified which contain between 2 and 37 modules. 
More than two hundred proteins, many of them transcription factors, have been shown to possess zinc 
finger domains. Zinc fingers connect transcription factors to their target genes mainly by binding to 
specific sequences of DNA. Zinc finger modules are found in a wide variety of transcription regulatory 
proteins in eukaryotic organisms. A zinc finger domain is generally composed of 25 to 30 amino acid 
residues which form one or more tetrahedral ion binding sites. The binding sites contain four ligands 
consisting of the sidechains of cysteine, histidine and occasionally aspartate or glutamate. The binding 
of zinc allows the relatively short stretches of polypeptide to fold into defined structural units which are 
well-suited to participate in macromolecular interactions (Berg, J. M. et al. (1996) Science 271:1081- 
1085). The zinc finger domain was first recognized in the transcription factor TFIIIA from Xenopus 
oocytes (Miller, et al., EMBO, 4:1609-1614, 1985; Brown, et al., FEBS Lett., 186:271-274, (1985)). 

Zinc binding domains which contain a C3HC4 sequence motif are known as RING domains 
(Lovering, R. et al. (1993) Proc. Natl. Acad. Sci. USA 90:2112-2116). The RING domain consists 
of eight metal binding residues, and the sequences that bind the two metal ions overlap (Barlow, P. 
N. et al. (1994) J. Mol. Biol. 237:201-211). Functions of RING finger proteins are mediated through 
DNA binding and include the regulation of gene expression, DNA recombination, and DNA repair 
(see Borden and Freemont, Curr Opin Struct Biol 6:395-401 (1996) and US patent 5,861,495). 

Both the RING finger and the LIM domain mediate protein-protein interactions and are 
involved in transcriptional control, either by directly affecting transcription or recruiting co-activators 
or co-repressors. LIM domains also contribute to various signalling pathways. They may interact with 
protein kinases and anchor gene products to large protein complexes or to cellular compartments. 

PHD fingers are C4HC3 zinc fingers spanning approximately 50-80 residues and distinct from 
RING fingers or LIM domains. They are thought to be mostly DNA or RNA binding domains but may 
also be involved in protein-protein interactions (for a review see Aasland et al, Trends Biochem Sci 
20:56-59 (1995)). The PHD finger domain, belonging to the zinc finger domain family, is found in many 
regulatory proteins which are frequently associated with chromatin-mediated transcriptional 
regulation. 

The nucleic acid binding activity of DBP or part thereof may be assayed using any of the 
assays known to those skilled in the art including those described in US patent 6,013,453. 

The invention relates to compositions and methods using DBPs or part thereof, especially 
fragments comprising a DNA-binding domain, to stimulate gene transcription. 

One of the remarkable features of activation domains of transcriptional factors in general is that 
"fusing" them to heterologous protein domains seldom affects their ability to activate transcription 
when recruited to a wide variety of promoters. The high degree of functional independence exhibited 
by these activation domains makes them valuable tools in various biological assays for analyzing gene 
expression and protein-protein or protein-RNA or protein-small molecule drug interactions. Several 
strategies to improve the potency of activation domains and thereby the expression of genes under their 
control have been reported. These approaches generally involve increasing the number of copies of 
activation domains fused to the DNA binding domain or generating activators containing synergizing 
combinations of activation domains. 

Therefore, in an additional embodiment, this invention provides compositions and methods 
containing new transcription factors comprising DBP or part thereof, preferably fragments containing 
DNA-binding domains. Such transcription factors may be designed to regulate the expression of target 
genes of interest. Aspects of the invention are applicable to systems involving either covalent or non- 
covalent linking of the transcription activation domain to a DNA binding domain. In practice, cells can 
be engineered by the introduction of recombinant nucleic acids encoding the fusion proteins containing 
at least two mutually heterologous domains, one of them being the DNA-binding domain of the 
invention, and in some cases additional nucleic acid constructs, to render them capable of ligand- 
dependent regulation of transcription of a target gene. Administration of the ligand to the cells then 
regulates (positively, or in some cases, negatively) target gene transcription (all laboratory methods 
related to this embodiment are completely described in US patent 6,015,709, the disclosure of which is 
hereby incorporated by reference in its entirety). Illustrative (non-limiting) examples of heterologous 
domains which can be included along with a DNA-binding domain in various fusion proteins of this 
invention include other transcription regulatory domains (i.e., transcription activation domains such 
as a p65, VP16 or AP domain; transcription potentiating or synergizing domains; or transcription 
repression domains such as an ssn-6/TUP-1 domain or Kruppel family suppressor domain); a DNA 
binding domain such as a GAL4, LexA or a composite DNA binding domain such as a composite zinc 
finger domain or a ZFHD1 domain; or a ligand-binding domain comprising or derived from (a) an 
immunophilin, cyclophilin or FRB domain; (b) an antibiotic binding domain such as tetR; or (c) a 
hormone receptor such as a progesterone receptor or ecdysone receptor. A wide variety of ligand 
binding domains may be used in this invention, although ligand binding domains which bind to a cell 
permeant ligand are preferred. It is also preferred that the ligand have a molecular weight under about 5 
kD, more preferably below 2.5 kD and optimally below about 1500 D. Non-proteinaceous ligands are 
also preferred. Examples of ligand binding domain/ligand pairs that may be used in the practice of this 
invention include, but are not limited to: FKBP:FK1012, FKBP:synthetic divalent FKBP ligands (see 
WO 96/0609 and WO 97/31898), FRB:rapamycin/FKBP (see e.g., WO 96/41865 and Rivera et al, "A 
humanized system for pharmacologic control of gene expression", Nature Medicine 2(9):1028-1032 
(1997)), cyclophilin:cyclosporin (see e.g. WO 94/18317), DHFR:methotrexate (see e.g. Licitra et al, 
1996, Proc. Natl. Acad. Sci. U.S.A. 93:12817-12821), TetR:tetracycline or doxycycline or other 
analogs or mimics thereof (Gossen and Bujard, 1992, Proc. Natl. Acad. Sci. U.S.A. 89:5547; Gossen et 
al, 1995, Science 268:1766-1769; Kistner et al, 1996, Proc. Natl. Acad. Sci. U.S.A. 93:10933-10938), 
progesterone receptor:RU486 (Wang et al, 1994, Proc. Natl. Acad. Sci. U.S.A. 91:8180-8184), 
ecdysone receptor:ecdysone or muristerone A or other analogs or mimics thereof (No et al, 1996, 
Proc. Natl. Acad. Sci. U.S.A. 93:3346-3351) and DNA gyrase:coumermycin (see e.g. Farrar et al, 
1996, Nature 383:178-181). In many applications it is preferable to use a DNA binding domain which 
is heterologous to the cells to be engineered. In the case of composite DNA binding domains, 
component peptide portions which are endogenous to the cells or organism to be engineered are 
generally preferred. 

In another aspect of this embodiment, polynucleotides encoding DNA-binding domains as well 
as any other functional fragments of DBP may be introduced into polynucleotides encoding fusion 
proteins for a variety of regulated gene expression systems, including both allostery-based systems such 
as those regulated by tetracycline, RU486 or ecdysone, or analogs or mimics thereof, and dimerization- 
based systems such as those regulated by divalent compounds like FK1012, FKCsA, rapamycin, 
AP1510 or coumermycin, or analogs or mimics thereof, all as described below (See also, Clackson, 
Controlling mammalian gene expression with small molecules, Current Opinion in Chem. Biol. 1:210- 
218 (1997)). The fusion proteins may comprise any combination of relevant components, including 
bundling domains, DNA binding domains, transcription activation (or repression) domains and ligand 
binding domains. Other heterologous domains may also be included. 

Another embodiment of this invention relates to expression systems, preferably vectors and 
vector-containing cells, using DBP or part thereof, especially the DNA-binding domain. In this regard, 
recombinant nucleic acids are provided which encode fusion proteins containing the transcription 
activation domain of the invention and at least one additional domain that is heterologous thereto, 
where the peptide sequence of said activation domain is itself optionally modified relative to the 
naturally occurring sequence from which it was derived to increase or decrease its potency as a 
transcriptional activator relative to the counterpart comprising the native peptide sequence. Each of the 
recombinant nucleic acids of this invention may further comprise an expression control sequence 
operably linked to the coding sequence and may be provided within a DNA vector, e.g., for use in 
transducing prokaryotic or eukaryotic cells. Some of the recombinant nucleic acids of a given 
composition as described above, including any optional recombinant nucleic acids, may be present 
within a single vector or may be apportioned between two or more vectors. The recombinant nucleic 
acids may be provided as inserts within one or more recombinant viruses which may be used, for 
example, to transduce cells in vitro or cells present within an organism, including a human or non- 
human mammalian subject. It should be appreciated that non-viral approaches (naked DNA, liposomes 
or other lipid compositions, etc.) may be used to deliver recombinant nucleic acids of this invention to 
cells in a recipient organism. The resultant engineered cells and their progeny containing one or more 
of these recombinant nucleic acids or nucleic acid compositions of this invention may be used in a 
variety of important applications, including human gene therapy, analogous veterinary applications, the 
creation of cellular or animal models (including transgenic applications) and assay applications. Such 
cells are useful, for example, in methods involving the addition of a ligand, preferably a cell permeant 
ligand, to the cells (or administration of the ligand to an organism containing the cells) to regulate 
expression of a target gene. 

The invention also relates to methods and compositions using DBP or part thereof to bind to 
nucleic acids, preferably DNA, alone or in combination with other substances. For example, DBP 
or part thereof is added to a sample containing nucleic acid in conditions allowing binding, and 
allowed to bind to nucleic acids. In a preferred embodiment, DBP or part thereof may be used to 
purify nucleic acids such as restriction fragments. In another preferred embodiment, DBP or part 
thereof may be used to visualize nucleic acids when the polypeptide is linked to an appropriate 
fusion partner, or is detected by probing with an antibody. Alternatively, DBP or part thereof may 
be bound to a chromatographic support, either alone or in combination with other DNA binding 
proteins, using techniques well known in the art, to form an affinity chromatography column. A 
sample containing nucleic acids to purify is run through the column. Immobilizing DBP or part 
thereof on a support is particularly advantageous for those embodiments in which the method is to 
be practiced on a commercial scale. This immobilization facilitates the removal of the protein from 
the batch of product and subsequent reuse of the protein. Immobilization of DBP or part thereof can 
be accomplished, for example, by inserting a cellulose-binding domain in the protein. One of skill 
in the art will understand that other methods of immobilization could also be used and are described 
in the available literature. 

In another embodiment, the present invention relates to compositions and methods using 
DBP or part thereof, especially the DNA-binding domain, to alter the expression of genes of interest 
in target cells using any techniques known to those skilled in the art, including those described in 
US patents 5,861,495; 5,866,325 and 6,013,453. Such genes of interest may be disease related genes, 
such as oncogenes, or exogenous genes from pathogens, such as bacteria or viruses. 

In still another embodiment, DBP or part thereof may be used to diagnose, treat and/or 
prevent disorders linked to dysregulation of gene transcription such as cancer and other disorders 
relating to abnormal cellular differentiation, proliferation, or degeneration, including 
hyperaldosteronism, hypocortisolism (Addison's disease), hyperthyroidism (Graves' disease), 
hypothyroidism, colorectal polyps, gastritis, gastric and duodenal ulcers, ulcerative colitis, and 
Crohn's disease. 

Protein of SEQ ID NO: 388 (internal designation 109-002-4-0-C6-CS) 
The protein of SEQ ID NO: 388 encoded by cDNA of SEQ ID NO: 147 is a 375 amino acid 
long protein containing a zinc finger domain, namely a PHD-finger domain from positions 329 to 339. 

The PHD finger was originally identified by comparison of the maize homeodomain (HD) 
protein ZMHOX1a (Bellmann R. and Werr W. EMBO J. 11: 3367-3374 (1992)) to its Arabidopsis 
relative HAT3.1 and named plant homeodomain (PHD) finger due to its association with the DNA- 
binding HD in both genes. This motif often occurs in various regulatory genes, such as members of the 
trithorax (TRX-G) or polycomb (PC-G) groups (Aasland R. et al. Trends Biochem.Sci. 20: 56-59 
(1995)) and leukaemia-associated proteins (LAP finger) (Saha V. et al. Proc.Natl.Acad.Sci. USA 92: 
9737-9741 (1995)). The established function of TRX-G and PC-G genes in chromatin modulation in 
Drosophila led to the suggestion that the PHD finger is involved in chromatin-mediated transcriptional 
control. Recent data provide evidence that PHD finger proteins are associated with chromatin 
remodelling complexes (Bochar D.A. et al. Proc.Natl.Acad.Sci. USA 97: 1038-1043 (2000)) or 
contribute to histone acetylation (Loewith R. et al. Mol.Cell.Biol. 20: 3807-3816 (2000)). Based on the 
position of the unique His residue, the cysteine scaffold of the PHD finger (Cys4-His-Cys3) is clearly 
distinct from RING fingers (Cys3-His-Cys4) and LIM domains (Cys2-His-Cys5) and from DRIL 
domains, where two RING finger motifs are closely linked. In contrast to the accumulating knowledge 
about LIM domains, functional data concerning the PHD finger remain rare (see rev. Halbach T. et al. 
Nucleic Acids Research 28: 3542-3550 (2000)). 
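
The distinction drawn above rests on where the single His residue falls among the eight metal ligands. As a purely illustrative sketch, assuming the ligands have already been extracted from a sequence and written in order as a string of C (Cys) and H (His), the three scaffolds can be told apart as follows. The function name and the input format are hypothetical and are not part of the disclosure.

```python
# Illustrative only: classify a zinc-binding scaffold from the order of its
# eight metal ligands, written as a string of "C" (Cys) and "H" (His).
# The function name and input format are assumptions made for this sketch.

# 0-based index of the single His among the eight ligands -> scaffold name.
HIS_INDEX_TO_SCAFFOLD = {
    4: "PHD finger (Cys4-His-Cys3)",
    3: "RING finger (Cys3-His-Cys4)",
    2: "LIM domain (Cys2-His-Cys5)",
}


def classify_scaffold(ligands: str) -> str:
    """Return the scaffold name for an ordered string of eight ligands."""
    ligands = ligands.upper()
    # Require exactly eight ligands, all Cys or His, with a single His.
    if len(ligands) != 8 or set(ligands) - {"C", "H"} or ligands.count("H") != 1:
        return "unclassified"
    return HIS_INDEX_TO_SCAFFOLD.get(ligands.index("H"), "unclassified")


if __name__ == "__main__":
    for scaffold in ("CCCCHCCC", "CCCHCCCC", "CCHCCCCC"):
        print(scaffold, "->", classify_scaffold(scaffold))
```

The sketch only encodes the ligand order stated in the text; it does not check the spacing between ligands, which a real motif search would also need.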

GYMNOS, a recently described member of the SWI2/SNF2 protein family in plants (22), also 
contains a PHD finger and takes part in the control of development. The second PHD finger motif of 
Drosophila dMI-2 protein (a reference for animal counterparts) shares high sequence conservation with 
known plant PHD fingers. Due to the similarity to the Drosophila MI-2 gene, GYMNOS has been 
implicated in chromatin modulation. While the PHD finger is an isolated motif in GYMNOS, the 
characteristic Cys4-His-Cys3 scaffold in PHDf-HD plant genes is embedded in a large region. This 
region shares 60% identical residues between seven genes of different plant species and is more highly 
conserved than the HD (40%). This conservation suggests that the PHD finger is part of a larger 
functional unit. When combined with a leucine zipper in the surrounding conserved 180 amino acid 
region in the PHDf-HD proteins, PHD finger activity is masked and silenced. The leucine zipper 
upstream of the PHD finger mediates interactions with helix 4 of plant 14-3-3 proteins, thus identifying 
PHDf-HD proteins as potential targets of 14-3-3 signalling pathways. The 14-3-3 family of 
multifunctional proteins is highly conserved between animals, plants and yeast. Due to the dimeric 
nature of 14-3-3 proteins and their capacity to form homo- and heterodimers, members of the 14-3-3 
protein family function as scaffolds promoting association of protein complexes. 14-3-3 proteins are 
involved in various signalling pathways that include, for example, Raf, BAD, Bcr/Bcr-Abl, KSR 
(kinase suppressor of Ras), PKC, PI-3 kinase and cdc25C phosphatase. Others enter the nucleus and are 
associated with DNA-binding complexes. Recent data even indicate contacts to TBP, TFIIB and the 
human TBP-associated factor hTAF(II)32 (for a review see Halbach T., supra). 

Recently, the PHD finger has been shown to activate transcription in yeast, plant and animal cells. 
Transcriptional activation in animal cells (in the zebrafish embryo as a test system) tested for different 
PHD fingers seems to be a general feature of the PHD finger motif in eukaryotic cells. 

It remains to be elucidated whether the PHD finger directly interacts with a component of the 
transcription initiation complex or if its positive effect on transcription is mediated via auxiliary protein 
interactions. Both assumptions, however, involve PHD finger-mediated protein-protein interactions. 
Surrounding sequences may interfere sterically with accession of the PHD finger and its exposure 
could eventually depend on binding of a protein partner. 

The PHD finger containing proteins appear to be involved in human diseases. Studies on the 
AIRE gene from humans (Nagamine K. et al. Nat.Genet. 17: 393-398 (1997), Scott H.S. et al. 
Mol.Endocrinol. 12: 1112-1119 (1998)) have shed more light on the importance of this motif, since all 
clinically significant mutations in the AIRE gene coincide with alteration in two PHD fingers, resulting 
in the rare autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED). The 
presence of PHD fingers in genes up-regulated in leukaemia, associated with the autoimmune disease 
APECED or participating in euchromatin to heterochromatin modulation, like the TRX-G or PC-G 
genes, indicates that this motif may be involved in a variety of important cellular events including 
developmental disorders, tumors and immune diseases. For example, the role of chromatin structure 
remodelling in cancer metastasis and tissue carcinogenesis is well documented (Zhang Y. et al. Cell 16: 
279-289 (1998); Klugbauer S. and Rabes H.M. Oncogene 29: 4388-4393 (1999)). 

It is believed that the protein of SEQ ID NO: 388 or part thereof is a zinc binding protein, 
preferably able to bind nucleic acids, more preferably a transcription factor. Preferred polypeptides of 
the invention are polypeptides comprising the amino acids of SEQ ID NO: 388 from positions 329 to 
339. Other preferred polypeptides of the invention are fragments of SEQ ID NO: 388 having any of the 
biological activities described herein. 

In one embodiment of the invention, the protein of the invention, or part thereof, or derivative 
thereof, may be used to diagnose, in a subject, developmental disorders and/or cell proliferative disorders 
linked to dysregulation of gene expression mediated by the PHD-finger domain of the protein of the 
invention. Such disorders include, but are not limited to, renal tubular acidosis, anemia, Cushing's 
syndrome, achondroplastic dwarfism, epilepsy, gonadal dysgenesis, hereditary neuropathies such as 
Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders 
such as Sydenham's chorea and cerebral palsy, spina bifida, and congenital glaucoma, cataract, 
sensorineural hearing loss, benign tumors, and cancers such as adenocarcinoma; leukemia; melanoma; 
lymphoma; sarcoma; and cancers of the bladder, colon, liver, brain, small intestine, large intestine, 
breast, ovary, kidney, lung, and prostate. Diagnosis may be performed using nucleic acids or 
antibodies able to detect the expression of the protein of the invention using any technique known to 
those skilled in the art including Northern blotting, RT-PCR, immunoblotting methods, 
immunohistochemistry, and enzyme-linked immunosorbent assay (ELISA) described herein. Quantities of 
the protein of the invention expressed in subject samples, control and disease from biopsied tissues or 
body fluids or cell extracts taken from patients are compared with the standard values. Deviation 
between standard and subject values establishes the parameters for diagnosing disease. 
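
The comparison of subject values with standard values can be illustrated numerically. The sketch below is only one possible way to express the deviation, namely as the number of sample standard deviations separating a subject measurement from the mean of the standard values. The choice of this statistic, the function name and the example numbers are assumptions made for illustration and are not specified in the present disclosure.

```python
# Illustrative only: express how far a subject's measured protein level lies
# from a set of standard (reference) values. The statistic used here, a simple
# z-score, is an assumption; the text does not prescribe a particular measure.
from statistics import mean, stdev


def deviation_from_standard(subject_value, standard_values):
    """Return (subject - mean of standards) / sample standard deviation."""
    mu = mean(standard_values)
    sigma = stdev(standard_values)  # requires at least two standard values
    return (subject_value - mu) / sigma


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units.
    standards = [10, 12, 11, 9, 13]
    print(round(deviation_from_standard(14, standards), 3))
```

Any cut-off applied to such a deviation would have to be established empirically for the assay and disorder in question.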

In another embodiment, antagonists or inhibitors of the protein of the invention or part thereof 
may be administered to patients to treat and/or prevent the above referred disorders. Antagonists or 
inhibitors of transcriptional activators may indeed be used to suppress transcriptional activation in 
tumor cells. Such antagonists and/or inhibitors may be antibodies specific for the protein of the 
invention that can be used directly as an antagonist, or indirectly as a targeting or delivery mechanism 
for bringing a pharmaceutical agent to cells or tissue which express the protein of the invention. 
Neutralizing antibodies (i.e., those which inhibit protein-protein interactions) are especially preferred 
for therapeutic use. Other methods to inhibit the expression of the protein of the invention include 
antisense and triple helix strategies as described herein. Other antagonists or inhibitors of the protein of 
the invention may be produced using methods which are generally known in the art, including the 
screening of libraries of pharmaceutical agents to identify those which specifically bind the protein of 
the invention. The protein of the invention, or part thereof, preferably its functional or immunogenic 
fragments, or oligopeptides related thereto, can be used for screening libraries of compounds in any of a 
variety of drug screening techniques. The fragment employed in such screening may be free in solution, 
affixed to a solid support, borne on a cell surface, or located intracellularly. The formation of binding 
complexes between the protein of the invention, or part thereof, or derivative thereof, and the agent 
being tested may be measured. Another technique for drug screening which may be used provides for 
high throughput screening of compounds having suitable binding affinity to the protein of the invention 
as described in published PCT application WO84/03564. 

Protein of SEQ ID NO: 394 (internal designation 157-17-2-0-C1-CS) 

The protein of SEQ ID NO: 394 encoded by the extended cDNA SEQ ID NO: 153 contains 
a myc-type, helix-loop-helix dimerization domain (Prosite PS00038) from amino acid position 13 to 
28 and has no adjacent basic domain. Using the Schiffer-Edmundson helical wheel diagram 
(Schiffer et al. (1967) Biophys.J. 7:121-135), a hypothetical amphipathic alpha helix is predicted 
between position 53 and position 68. Three hydrophobic amino acids, Val55, Phe59 and Ile63, are 
aligned on the same side of the helix to present a hydrophobic interaction surface and three 
hydrophilic residues (Tyr53, Gln62 and Ser64) are presented on the other side of the helix. There is no 
proline residue within the stretch to disrupt the continuity of the alpha helix. Thus, these structural 
features in the protein of the invention indicate that this protein could be a novel member of the 
nonbasic "helix-loop-helix" subfamily (HLH) of transcription regulators. 
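
The helical wheel projection mentioned above places successive residues of an ideal alpha helix 100 degrees apart (3.6 residues per turn). As a minimal sketch, assuming position 53 is taken as the zero-degree reference, the angular positions of the three hydrophobic residues named in the text can be computed as follows. The function name and the choice of reference position are illustrative assumptions.

```python
# Illustrative only: angular positions on a Schiffer-Edmundson helical wheel,
# assuming an ideal alpha helix with 100 degrees of rotation per residue.
# The reference position (53) and the function name are assumptions.

DEGREES_PER_RESIDUE = 100.0


def wheel_angle(position, start=53):
    """Return the wheel angle in degrees of a residue relative to `start`."""
    return ((position - start) * DEGREES_PER_RESIDUE) % 360.0


if __name__ == "__main__":
    hydrophobic = {"Val55": 55, "Phe59": 59, "Ile63": 63}
    angles = {name: wheel_angle(pos) for name, pos in hydrophobic.items()}
    for name, angle in angles.items():
        print(name, angle)
    # Simple spread of the angles; valid here because the three values do
    # not wrap around the 0/360 degree boundary.
    print("arc:", max(angles.values()) - min(angles.values()))
```

Residues spaced four positions apart advance by 40 degrees on the wheel, which is why residues at positions i, i+4 and i+8 fall within a narrow arc on one face of the helix.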

The helix-loop-helix (HLH) family of transcriptional regulators is involved in the control of 
different cellular differentiation phenomena such as neurogenesis, haematopoiesis, myogenesis 
and angiogenesis. The HLH proteins are found in all eukaryotic organisms ranging from the yeast 
Saccharomyces cerevisiae to human (Reviewed by Massari ME and Murre C. (2000) Molecular and 
Cellular Biology, 20 (2):429-440). The HLH proteins bind DNA as dimers, and different members 
of the HLH family bind either as homodimers or as heterodimers with other members of the family. 
The presence in a cell of a large repertoire of distinct complexes that can bind to a particular DNA 
sequence element suggests that competition for DNA binding may play a regulatory role. 

Members of the helix-loop-helix (HLH) family of transcriptional regulation proteins share a 
common structural element, i.e. a stretch of 40-50 amino acids containing two short amphipathic 
alpha-helices separated by a linker region (the loop) of varying length (Murre C et al. (1989) Cell 
56:777-783). This element was initially identified as a region of homology among c-myc, the 
muscle determination gene MyoD (Davis RL et al. (1987) Cell 51:987-1000) and the Drosophila 
achaete-scute complex (AS-C) involved in neural determination (Villares R. and Cabrera CV (1987) 
Cell 50:415-424). The HLH proteins form both homodimers and heterodimers by means of 
interaction between the hydrophobic residues on the corresponding faces of the two helices to give a 
parallel four-helix bundle structure (Adrian R et al. (1993) Nature, 363:38-45; Ellenberger T et al. 
(1994) Genes Dev. 8:970-980). The alpha helical regions are usually 15-16 amino acids long with 
hydrophobic residues at every third and fourth position, and each helix contains several conserved 
residues (Murre C et al. (1989) Cell, 56:777-783; Benezra R. et al. (1990) Cell, 61:49-59). 

The HLH protein family is subdivided into two major groups: the so-called "bHLH" and 
"non basic HLH" subfamilies. Proteins of the bHLH family contain a conserved highly basic 
region immediately N-terminal to the first helix (known as bHLH structure), and mutagenesis 
experiments on MyoD protein confirm that this region is responsible for sequence-specific binding 
to the "E-box", a consensus DNA motif for bHLH proteins (Davis RL. et al. (1990) Cell, 60:733- 
746). A dimeric bHLH protein (either homodimeric or heterodimeric but in which both subunits 
contain a basic region) is able to bind to DNA. In general, the bHLH proteins fall into two 
categories: class A consists of proteins that are ubiquitously expressed, including mammalian 
E12/E47 and fly da, whereas class B consists of proteins that are expressed in a more tissue- 
specific manner, including mammalian MyoD and fly AS-C. In most cases, the tissue-specific 
bHLH proteins preferentially heterodimerize with ubiquitous partners. 
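By way of illustration only, candidate E-box sites in a nucleotide sequence may be located with a simple pattern search. The sketch below assumes the commonly cited E-box consensus CANNTG (N being any base), which is not spelled out in the text above; the function name `find_eboxes` and the example sequence are hypothetical. Because the fixed positions of CANNTG read the same on the reverse complement, scanning one strand is sufficient.

```python
import re

# Assumed E-box consensus: CANNTG, where N is any of A, C, G, T.
# The lookahead lets overlapping candidate sites be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(dna: str) -> list[tuple[int, str]]:
    """Return (1-based start position, hexamer) for each candidate E-box."""
    seq = dna.upper()
    return [(m.start() + 1, m.group(1)) for m in EBOX.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical promoter fragment, used only to exercise the function.
    promoter = "ggCAGCTGttacCACGTGaa"
    for position, site in find_eboxes(promoter):
        print(position, site)
```

A match only indicates a candidate site; actual binding by a given bHLH dimer would still have to be confirmed experimentally.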

The non basic HLH subfamily contains proteins that lack a basic region and are therefore unable
to bind to DNA, but that can form homo- or heterodimers through their HLH motif. Indeed, heterodimeric
complexes between non basic HLH and bHLH proteins fail to bind to DNA and negatively
modulate bHLH protein-mediated transcription activation. This phenomenon was first
demonstrated in a MyoD/Id regulation model (Benezra R. et al. (1990) Cell, 61:49-59). The MyoD
gene product is able to activate previously silent muscle-specific genes when introduced into a large
variety of differentiated cell types. MyoD proteins form either homodimers or heterodimers with
other bHLH proteins such as E12 or E47, and bind to the E-box consensus motif to activate
myogenesis. The Id gene, conserved from batrachians to mammals (Wilson R et al. (1995)
Mech.Dev. 49:211-222; Sawai S et al. (1997) Mech.Dev. 65:175-185; Norton JD et al. (1998)
Trends in Cell Biology 8:58-65), lacks a basic region adjacent to its HLH motif but is able to
specifically dimerize with either MyoD, E12 or E47 and has been shown to subsequently attenuate
the heterodimer's ability to bind DNA. Additionally, overexpression of Id inhibits MyoD-dependent
gene activation in in vivo transfection experiments. Id proteins may function either to
directly repress the activity of tissue-restricted bHLH proteins by rendering them non-functional or,
more likely, to sequester the ubiquitous bHLH proteins and prevent them from forming active
heterodimers with the tissue-restricted bHLH proteins (reviewed by Norton JD et al. (1998) Trends
in Cell Biology 8:58-65).

The possibility that the Id protein behaves as a dominant-negative regulator to repress
MyoD protein activity through the formation of nonfunctional heterodimeric complexes is
considerably strengthened by the following findings in Drosophila. In Drosophila, the development
of the peripheral nervous system is positively regulated by the two structurally related bHLH proteins,
AS-C and daughterless (da), since loss of either activity results in loss of sensory organ
development. The extramacrochaetae (Emc) product, belonging to the non basic HLH subfamily, was
shown to antagonize the activity of AS-C and da through the formation of nonfunctional
heterodimers with the bHLH proteins (Hillary M et al. (1990) Cell 61:27-38; Garrell J et al. (1990)
Cell 61:39-48).

Human Id genes including human Id1, Id2, Id3 and Id4 have been identified and localized
(reviewed by Norton JD et al. (1998) Trends in Cell Biology 8:58-65). The bHLH proteins and Id
proteins are thought to be involved in the regulation of apoptosis. Differentiation and development
of T- and B-lymphocytes in the immune system are positively regulated by the combination of
ubiquitous E proteins and lymphocyte-restricted bHLH proteins. Disruption in gene expression
from either class results in severe perturbation of T- and B-lymphocyte development (Bain G et al.
(1997) Mol Cell Biol 17:4782-4791; Zhuang et al. (1996) Mol Cell Biol 16:2898-2905).
Cell-arrested T thymocytes undergo massive apoptosis when the Id1 gene is overexpressed (Kim D (1999)
Mol Cell Biol 19(12):8240-53). Overexpression of the Id1 gene product also results in apoptosis in
neonatal and adult cardiac myocytes in culture (Tanaka K et al. (1998) J Biol Chem
273(40):25922-25928).

Id1 and Id3 proteins are also required to support angiogenesis. Quiescent adult endothelial
cells express minimal levels of the Id proteins, whereas Id expression is upregulated in angiogenic
endothelial cells. Partial loss of these proteins in Id1+/-Id3-/- double knockout mice impairs
angiogenesis, resulting in resistance to tumour growth (Lyden D et al. (1999) Nature 401:670-677).
In addition, a significant overexpression of mRNA and protein levels of Id1, Id2 and Id3 has
been found in patients with pancreatic cancer (Maruyama H et al. Am J Pathol (1999)
155(3):815-822). A correlation between Id1 gene upregulation and the aggressive phenotype of human
breast cancer cells has also been reported (Lin CQ et al. (2000) Cancer Res 60(5):1332-40).

Thus, identification and cloning of members of the HLH family, and especially of the non
basic HLH subfamily, is necessary to enrich our knowledge of the biological importance of the
HLH transcription factor network and, furthermore, to provide insights and tools for disorders linked
to dysregulation of HLH-mediated transcription.

It is believed that the protein of SEQ ID NO: 394 or part thereof plays a role in the
regulation of transcription activation, probably as a member of the HLH family, preferably of the
non basic HLH subfamily. More particularly, the protein of the invention is thought to be able to
antagonize the activity of members of the bHLH family through the formation of heterodimers.
Preferred polypeptides of the invention are polypeptides comprising the amino acids of SEQ ID
NO: 394 from positions 13 to 28, from positions 53 to 68, and from positions 13 to 68. Other
preferred polypeptides of the invention are fragments of SEQ ID NO: 394 having any of the
biological activities described herein.
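As a minimal sketch of how the recited residue ranges map onto a sequence string, the following assumes that positions are 1-based and inclusive. The helper name `fragment` is hypothetical, and `PLACEHOLDER` is an arbitrary 80-residue string standing in for SEQ ID NO: 394, which is not reproduced in this section.

```python
def fragment(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq (1-based, inclusive; an assumption)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} is outside 1-{len(seq)}")
    return seq[start - 1:end]


# Arbitrary placeholder sequence; NOT the actual SEQ ID NO: 394.
PLACEHOLDER = "".join("ACDEFGHIKLMNPQRSTVWY"[i % 20] for i in range(80))

# Residue ranges recited in the text above.
RANGES = [(13, 28), (53, 68), (13, 68)]

if __name__ == "__main__":
    for start, end in RANGES:
        piece = fragment(PLACEHOLDER, start, end)
        print(f"{start}-{end} ({len(piece)} residues): {piece}")
```

Under this reading, the two shorter ranges are each 16 residues long and the range 13 to 68 spans both of them together with the intervening residues.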

The dimerization ability of the protein of the invention or part thereof, which is
characteristic of the HLH family, may be assayed using any of the assays known to those skilled in
the art. For example, interacting protein partners, especially members of the bHLH subfamily, may
be identified by screening cDNA expression libraries as described for the identification of
some HLH transcription factors such as E12 and E47 (Murre C et al. (1989) Cell 56:777-783), Max
(a Myc binding factor) (Elizabeth M et al. (1991) Science 251:1217) as well as Id (Benezra R et al.
(1990) Cell 61:49-59). Alternatively, the helix-loop-helix motif in the protein of the invention
could be used by those skilled in the art as a "bait protein" in a well established yeast two-hybrid
system to identify its interacting protein partners in vivo from a cDNA library derived
from different tissues or cell types of a given organism. Alternatively, the protein of the invention
or part thereof could be used by those skilled in the art in mammalian cell transfection experiments.
When fused to a suitable peptide tag such as a [His]6 tag in a protein expression vector and introduced
into cultured cells, the expressed fusion protein can be immunoprecipitated with its potential
interacting proteins by using an anti-tag peptide antibody. This method could be chosen either to
identify the associated partner or to confirm the results obtained by other methods such as those just
mentioned.

An object of the invention relates to compositions and methods using the protein of the
invention or part thereof to dysregulate gene transcription, preferably transcription mediated by
HLH regulators either in vitro or in vivo, through overexpression of the protein of the invention
using any means known to those skilled in the art.

The protein of the invention or part thereof could be used to induce apoptosis of specific
cell types under either physiological or pathological conditions. In a preferred embodiment, the
apoptosis-active polypeptide is added to an in vitro culture of mammalian cells in an amount
effective to induce apoptosis. In another preferred embodiment, the apoptosis-active polypeptide is
expressed under the control of a promoter which may be activated under precise conditions. In
particular, such conditional expression of an apoptosis-active polypeptide upon demand may be
very useful to eliminate cells that have become unwanted, for example in applications where such
cells have been used for cellular therapy and are no longer needed. Another example of
application is expression under the control of a promoter that becomes active after
infection by a given microorganism, thus resulting in the death of the infected cells only.
Furthermore, the protein of the invention or part thereof may be useful in the diagnosis, the
treatment and/or the prevention of disorders in which apoptosis is beneficial, including but not
limited to disorders linked to abnormal cellular proliferation such as those described below.

In another embodiment, the protein of the invention or part thereof can be used to diagnose,
treat and/or prevent disorders linked to overexpression of HLH proteins, such as cancer and other
disorders relating to abnormal cellular differentiation, proliferation, or degeneration, including
hyperaldosteronism, hypocortisolism (Addison's disease), hyperthyroidism (Graves' disease),
hypothyroidism, colorectal polyps, gastritis, gastric and duodenal ulcers, ulcerative colitis,
Crohn's disease, and neurodegenerative disorders such as Parkinson's or Alzheimer's disease, using any
methods and/or techniques described herein. In addition, the protein of the invention or part thereof
may be used to evaluate disease progression and the efficiency of clinical treatment. The protein
of the invention or part thereof could also be used as a molecular target for anti-angiogenesis drug
design. Inhibition of protein expression could be achieved by many means known to those skilled
in the art including those described in the present application. For example, an antisense nucleotide
or triple helix strategy could be developed to block synthesis of the protein. Alternatively, the
expressed protein of the invention might be neutralized with a specific monoclonal antibody
using techniques known to those skilled in the art including those described in Peverali FA et al.
(1994) EMBO J. 13:4291-4301; Barone MV et al. (1994) Proc.Natl.Acad.Sci.USA 91:4985-4988;
and Haza ET et al. (1994) J.Biol.Chem. 269:2139-2145.

Protein of SEQ ID NO: 466 (internal designation J84-4-2-0-D3-CS)

The protein of SEQ ID NO: 466, overexpressed in liver and encoded by the cDNA of SEQ
ID NO: 225, displays a zinc finger motif of RING type (C3HC4) (Pfam signature from positions 41
to 81, Prosite signature from positions 56 to 65) and a B-box zinc finger motif (Pfam signature from
positions 110 to 153). In addition, the protein of the invention is predicted to have a nuclear
localization.

It is believed that the protein of SEQ ID NO: 466 or part thereof is a zinc binding protein,
preferably able to bind nucleic acids, more preferably a transcription factor. Preferred polypeptides
of the invention are polypeptides comprising the amino acids of SEQ ID NO: 466 from positions 41
to 81 (RING zinc finger), and from positions 110 to 153 (B-box domain). Other preferred
polypeptides of the invention are fragments of SEQ ID NO: 466 having any of the biological
activities described herein.

Protein of SEQ ID NO: 267 (internal designation 116-111-1-0-H9-CS)

The protein of SEQ ID NO: 267, encoded by the extended cDNA SEQ ID NO: 26, exhibits
an Emotif zinc finger domain, C2H2 type, from positions 185 to 202, and is thought to be localized
in the nucleus.

It is believed that the protein of SEQ ID NO: 267 or part thereof is a zinc binding protein,
preferably able to bind nucleic acids, more preferably a transcription factor. Preferred polypeptides
of the invention are polypeptides comprising the amino acids of SEQ ID NO: 267 from positions
185 to 202. Other preferred polypeptides of the invention are fragments of SEQ ID NO: 267 having
any of the biological activities described herein.

Protein of SEQ ID NO: 277 (internal designation 160-103-1-0-F11-CS)

The protein of SEQ ID NO: 277 encoded by the extended cDNA SEQ ID NO: 36 exhibits a 
pfam DHHC zinc finger domain from positions 140 to 204. 

It is believed that the protein of SEQ ID NO: 277 or part thereof is a zinc binding protein,
preferably able to bind nucleic acids, more preferably a transcription factor. Preferred polypeptides
of the invention are polypeptides comprising the residues of SEQ ID NO: 277 from positions 140 to
204. Other preferred polypeptides of the invention are fragments of SEQ ID NO: 277 having any
of the biological activities described herein.

Protein of SEQ ID NO: 272 (internal designation 145-25-3-0-B4-CS) 

The protein of SEQ ID NO: 272, encoded by the extended cDNA SEQ ID NO: 31, shows
homology with numerous zinc binding proteins. In addition, the protein of the invention exhibits
the Pfam RING zinc finger signature from positions 87 to 129. The protein of SEQ ID NO: 272 has
a variant, i.e. the protein of SEQ ID NO: 273 encoded by the extended cDNA SEQ ID NO: 32 and
thought to have the same function and utilities.

It is believed that the protein of SEQ ID NO: 272 or part thereof is a zinc binding protein,
preferably able to bind nucleic acids or proteins, more preferably a transcription factor. Preferred
polypeptides of the invention are polypeptides comprising the amino acids of SEQ ID NO: 272
from positions 87 to 129. Other preferred polypeptides of the invention are fragments of SEQ ID
NO: 272 having any of the biological activities described herein.

Hydrolases and inhibitors 

The invention relates to compositions and methods using proteins of the invention having a
hydrolytic activity, herein referred to as HYP, such as the ones described in this section and those
containing a hydrolytic domain as shown in Table VI, or parts thereof, preferably fragments
comprising a hydrolytic domain, or derivatives thereof.

The invention relates to methods and compositions using HYP or a fragment thereof to
hydrolyze one or several substrates, alone or in combination with other substances. For example,
the protein of the invention or part thereof is added to a sample containing the substrate(s) in
conditions allowing hydrolysis, and allowed to catalyze the hydrolysis of the substrate(s).
Hydrolyzed substrates are then detected using standard methods known to those skilled in the art.
The protein of the invention or part thereof can also be added to samples as a "cocktail" with other
hydrolytic enzymes, such as other peptidases, for example to decontaminate surgical instruments
using methods described in US patent 5,489,531. The advantage of using a cocktail of hydrolytic
enzymes is that one is able to hydrolyze a wide range of substrates without necessarily knowing the
specificity of each enzyme. Using a cocktail of hydrolytic enzymes also protects a sample from a
wide range of future unknown contaminants from a vast number of sources. Alternatively, HYP or
part thereof may be bound to a chromatographic support, either alone or in combination with other
hydrolytic enzymes, using techniques well known to those skilled in the art, to form an affinity
column to remove the substrate. Immobilization facilitates removal of the enzyme from the batch
of product and subsequent reuse of the enzyme.

Immobilization of the enzyme or part thereof can be accomplished, for example, by adding a
cellulose-binding domain to the protein through the modification of the DNA sequence coding for
the protein or part thereof. One of skill in the art will understand that other methods of
immobilization could also be used and are described in the available literature. Alternatively, the
same methods may be used to identify new substrates.

In another embodiment, HYP or part thereof may be used to identify or quantify the amount
of a given substrate in a biological sample. In a preferred embodiment, HYP or part thereof is
catalytically inactivated, i.e. capable of binding but not hydrolyzing a given substrate, using any of
the methods known to those skilled in the art including those which produce a mutant enzyme, a
recombinant enzyme, or a chemically inactivated enzyme. The catalytically inactive protein of the
invention is then incubated with an aliquot of a biological sample under conditions suitable for
binding of the inactive enzyme to the substrate. Then, the bound enzyme is detected to assess the
presence or amount of the substrate in the biological sample. In another preferred embodiment,
HYP or part thereof is used in assays and diagnostic kits for the identification and quantification of
substrates in a biological sample. These assays can be based, for example, on standard
enzyme-linked immunosorbent assays (ELISA) or any other technique known to those skilled in the art.
In addition, HYP or part thereof may be used to identify, e.g. using screens based on standard assays
such as those described above, inhibitors of the enzyme for mechanistic and clinical applications.
Such inhibitors may then be used to identify or quantify HYP in a sample, and to diagnose, treat or
prevent any of the disorders where the protein's activity is undesirable and/or deleterious.

Protein of SEQ ID NO: 400 (internal designation 160-54-1-0-F7-CS)

The protein of SEQ ID NO: 400, encoded by the cDNA of SEQ ID NO: 159, exhibits two
putative transmembrane domains encompassing amino acids 50-70 and 127-147 as predicted by the
software TopPred II (Claros and von Heijne, CABIOS Applic. Notes, 10:685-686 (1994)). It also
displays the Prosite carboxypeptidase zinc-binding region signature PS00133 at positions 117-127.
It is predicted by the psort software (see Nakai K and Horton P, Trends Biochem Sci. 1999
Jan;24(1):34-6) to localize to the nucleus with a high probability (73.9%). Finally, it is specifically
expressed in fetal brain and shows no homology to previously known proteins.
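For illustration, a signature written in Prosite pattern syntax can be searched for in a protein sequence by translating it into a regular expression. The sketch below is a simplified converter (it ignores the N- and C-terminal anchors of the full syntax), and the names `prosite_to_regex` and `scan` are hypothetical. The pattern shown is an invented 11-residue example and is not the PS00133 pattern, which should be taken from the Prosite database; the test sequence is likewise invented.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple Prosite-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split an element such as 'x(3)' or '[LIV](2,4)' into core and repeat.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."                      # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # any residue except those listed
        # '[...]' sets and single letters are already valid regex fragments.
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


def scan(pattern: str, protein: str) -> list[tuple[int, int]]:
    """Return (start, end) of each match, 1-based and inclusive."""
    rx = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.start() + len(m.group(1)))
            for m in rx.finditer(protein.upper())]


# Invented pattern for demonstration only; NOT the actual PS00133 signature.
EXAMPLE_PATTERN = "H-[ST]-x(3)-[LIV]-x(2)-[FYW]-P-[FYW]"

if __name__ == "__main__":
    print(prosite_to_regex(EXAMPLE_PATTERN))
    print(scan(EXAMPLE_PATTERN, "MKKHSAAALGGWPFKK"))
```

A hit reported this way is a sequence-level match only and says nothing by itself about whether the region binds zinc.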

Carboxypeptidase enzymes hydrolyze the terminal amino acid of a protein or peptide. A
novel family of carboxypeptidases, localized in the nucleus and with a carboxypeptidase-dependent
transcriptional activity, has emerged only recently. Its first member, AEBP1, was previously
identified as a 3T3 preadipocyte factor implicated in the repression of aP2 gene expression.
AEBP1 stands for "AE-1 Binding Protein," where AE-1 is a regulatory element of the adipose P2
gene (aP2), a gene involved in triglyceride metabolism and activated in adipocytes. Expression of
AEBP1 itself is abolished during adipocyte differentiation (He GP et al., Nature 378:92-96(1995)).
AEBP1 was subsequently shown to play a similar role in the differentiation of osteoblastic cell lines
(Ohno I et al., Biochem Biophys Res Commun. 1996 Nov 12;228(2):411-4) and vascular smooth
muscle cells (Layne MD et al., J. Biol. Chem. 273:15654-15660(1998)). It was proposed that
AEBP1 acts as a negative transcription factor by cleaving proteins involved in transcription, a new
feature in transcription regulation. Recent evidence further suggests that its transcriptional activity
is itself attenuated by binding to G-protein subunits (Park JG et al., EMBO J. 1999 Jul
15;18(14):4004-12) and stimulated by DNA binding (Muise AM and Ro HS, Biochem J. 1999 Oct
15;343 Pt 2:341-5).

It is believed that the protein of SEQ ID NO: 400 plays a role in cell signaling, nuclear
transcriptional activity and in the differentiation of several cell types, especially those found in the
developing brain (including but not limited to neurons). Preferred polypeptides of the invention are
polypeptides having any of the biological activities described herein.

One embodiment of the present invention relates to compositions and methods using the
protein of the invention or part thereof as a marker for specific cell compartments (especially the
nucleus) and/or tissue types (especially fetal brain). For example, the protein of the invention or
part thereof may be used to generate specific antibodies which would in turn allow the visualization
of nuclear structures by methods well-known to those of skill in the art. In a similar fashion,
antibodies raised against the protein of the invention may be used to identify particular
developmental stages (fetal for instance) and/or given tissue types (brain for instance), as the protein
of the invention is specifically expressed in brain tissues at a fetal stage. Antibodies and antiserum
can also be used to inhibit undesirable carboxypeptidase activities in in vitro experiments and cell
cultures, as well as in biological samples and in vivo. Alternatively, quantitative analysis or
detection of the protein of the invention, or of nucleic acids encoding the protein, can be carried out
by any other technique known to those skilled in the art.

In another embodiment, the protein of the invention may be used to target heterologous
compounds (polypeptides or polynucleotides) to the developing brain and/or the cell nucleus. For
instance, a chimeric protein composed of the protein of the invention recombinantly or chemically
fused to a protein or polynucleotide of therapeutic interest would allow the delivery of the
therapeutic protein/polynucleotide specifically to the above-mentioned cellular/tissue targets
(nucleus, fetal brain).

In another embodiment, the present invention relates to methods and compositions using the
protein of the invention or a fragment thereof to hydrolyze one or several substrates, alone or in
combination with other substances. The ability of the present protein to hydrolyze any particular
substrate can easily be determined by carrying out a hydrolysis reaction using standard assay
techniques such as the ones described by Slusher et al. (Slusher et al. - Prostate - 2000, 44(1):
55-60) or any other technique well known to those skilled in the art. Potential substrates are any
substance containing a peptide bond, more specifically a C-terminal peptide bond. Such substances
include, but are not limited to, polypeptides, folic acid and its analogues (e.g. methotrexate). For
example, the protein of the invention or part thereof is added to a sample containing the substrate(s)
in conditions allowing hydrolysis, and allowed to catalyze the hydrolysis of the substrate(s).
Hydrolyzed substrates are then detected using standard methods known to those skilled in the art.

In a preferred embodiment, the protein of the invention or part thereof may be used to
modulate cellular transcriptional activity, thereby modulating cellular differentiation. Specifically,
because nuclear carboxypeptidases play a role in inhibiting transcription associated with
differentiation, an increase in the activity or expression of the protein can be used to inhibit
differentiation. The ability to inhibit differentiation has a number of uses, for example during the
cultivation of undifferentiated pluripotent cells to maintain the cultured cells in an undifferentiated
state until the need for a given cell type arises (in cases of grafts for instance). The level of the
protein activity or expression can be increased in any of a number of ways, including by introducing
a polynucleotide encoding the protein into cells, by administering the protein itself to cells, or by
administering to cells a compound that increases protein activity or expression. Alternatively, the
protein of the invention can be inhibited, thereby enhancing cellular differentiation. The ability to
promote differentiation has many uses, including in the treatment or prevention of cancer, as cancer
cells are often in a relatively undifferentiated state, and cellular differentiation is typically
accompanied by growth arrest.

In another embodiment, the protein of the invention or part thereof may be used to
diagnose, treat and/or prevent disorders where the presence of substrates, for example excess
proteins or peptides, is undesirable or deleterious. Such disorders include, but are not limited to,
cancer, neurodegenerative disorders such as Parkinson's and Alzheimer's diseases, and diabetes. In
another embodiment, the protein of the invention or part thereof may be used to identify or quantify
the amount of a given substrate (e.g. a peptide, folic acid, or methotrexate) in a biological sample.
In a preferred embodiment, the protein of the invention or part thereof is used in assays and
diagnostic kits for the identification and quantification of substrates in a biological sample.

In a most preferred embodiment, the protein of the invention or part thereof can be used in
cancer chemotherapies in rescue therapy following toxic high-dose methotrexate regimes. Many
carboxypeptidases can cleave the C-terminal glutamate moiety from folic acid and its analogues,
such as methotrexate. The key role of reduced folates as coenzymes in many biological pathways,
including those leading to DNA synthesis via the pyrimidines and purines, has made folic acid a
target molecule for chemotherapy. Tumor cells grow rapidly and have a high rate of nucleic acid
synthesis. Depletion of folic acid has cytotoxic effects, primarily in replicating tissues, and can
inhibit growth of tumors with high folic acid requirements. Many carboxypeptidases can directly
deplete folate by hydrolytic removal of its glutamate moiety. In cancer chemotherapy, methotrexate
(4-amino-N10-methyl-pteroyl-glutamate) is commonly used to deplete the pool of reduced folates by
inhibiting dihydrofolate reductase (DHFR), which catalyses the reduction of folates into the
biologically active tetrahydrofolate form, essential in the biosynthesis of all folate coenzymes.
Thus, the protein of the invention or part thereof could be used in rescue therapy following toxic
high-dose regimes such as described by Widemann et al. (Widemann B. et al. - Proc. Am. Assoc.
Cancer Res. - 1995, 36, p232) and Chabner et al. (Chabner B. et al. - Nature - 1972, 239,
p395-397), which disclosures are hereby incorporated by reference in their entirety. The basis of this
strategy is that hydrolysis of methotrexate produces 4-amino-N10-methyl-pteroate, which is about 100
times less active as an inhibitor of DHFR.

In another preferred embodiment, the protein of the invention or part thereof can be used in
an enzyme/prodrug strategy to treat a number of pathologies, especially those treated with drugs
associated with severe side effects, including, but not limited to, autoimmune diseases and chronic
inflammatory diseases such as rheumatoid arthritis, and cancer chemotherapy. These side effects
can be mainly explained by the fact that the in vivo selectivity of the drugs used is too low (for
example, the inadequate selectivity between tumor and normal cells of most anticancer drugs is well
known and their toxicity to normal tissues is dose limiting). In the first phase of one example of
such a protocol, a conjugate of the protein of the invention or part thereof and an antibody to a
tissue-specific antigen (for example, tumor-specific antigens in the case of cancer chemotherapy) is
administered. After a delay to allow residual enzyme conjugate to be cleared from the blood, a
relatively non-toxic compound is administered to the patient. This non-toxic compound is a
substrate of the protein of the invention, and is converted by the protein into a substantially more
toxic compound. Thus, because of the previous, targeted administration of the protein of the
invention, when the non-toxic compound is administered, the toxic compound is only produced in
the vicinity of the cells targeted by the fusion protein. This two-phase approach has been termed
antibody-directed enzyme-prodrug therapy (ADEPT) and is reviewed by Melton et al.
(Melton R. et al. - J. Natl. Cancer Inst. - 1996, 88, p153-165). Alternatively, the first phase can be
replaced by a gene therapy approach resulting in the de novo synthesis of the protein of the
invention or part thereof by cells from the targeted tissue; this has been termed gene-dependent
enzyme/prodrug therapy (GDEPT). Another advantage of these two approaches (ADEPT and
GDEPT) is that a single enzyme molecule is capable of activating many prodrug molecules.

Protein of SEQ ID No: 242 (internal designation J 19-003-4-0-C2-CS)

The protein of SEQ ID No: 242, encoded by the cDNA of SEQ ID No: 1, is homologous to 
proteins of the M20 metallopeptidases family (EC 3.4.17.X). The protein of the invention is over- 
expressed in the spinal cord and the brain. 

The M20 metallopeptidase family of proteins are all peptidases (i.e. enzymes able to
hydrolyze peptide bonds); furthermore, they are all exopeptidases, which means that they
hydrolyze the terminal amino acid of a protein or peptide. Members of the M20 peptidase family
are glutamate carboxypeptidases, which are capable of releasing the C-terminal glutamate residue,
by hydrolysis, from a wide range of N-acyl groups, including peptidyl, aminoacyl, benzoyl,
benzyloxycarbonyl, folyl, and pteroyl groups, and physiologically are involved in the catabolism of
proteins. M20 carboxypeptidases are either monomeric or homodimeric (i.e. two identical proteins
assembled to form the enzyme). In order to be active, metallopeptidases must be associated with a
metallic cofactor (either zinc or cobalt depending on the enzyme). The most studied
carboxypeptidase of the M20 family is carboxypeptidase G2 (CPG2) (EC 3.4.17.11), a bacterial
enzyme from Pseudomonas sp. (strain RS-16). CPG2 is a dimeric zinc carboxypeptidase that
cleaves the C-terminal glutamate moiety from a number of molecules.

The protein of SEQ ID No: 242 includes the Pfam signature for M20 peptidase (positions
107 to 451). The protein of SEQ ID No: 242 also includes a number of amino acids that are
conserved throughout the M20 protease family, especially those that interact with the metal cofactor.
Preferred polypeptides of the invention are polypeptides of SEQ ID No: 242 that include the highly
conserved amino acids 133, 135, 149, 163, 200, 201 and/or 262, which are present in over 80% of
the members of the M20 peptidase family, and/or amino acids 139, 157, 162, 16, 367 and/or 377,
which are present in over 60% of the members of the M20 peptidase family. Of particular interest
are amino acids 133, 166, 201 and 262, which by homology are probably involved in the interaction
with the metal cofactors. Thus it is believed that the protein of SEQ ID No: 242 or part thereof is a
peptidase, preferably a carboxypeptidase, more preferably a metallocarboxypeptidase of the M20
family. Other preferred polypeptides of the invention are any fragments of SEQ ID No: 242 having
any of the biological activities described herein.

Determination of carboxypeptidase activity on specific substrates can easily be obtained by
carrying out the hydrolysis using standard assay techniques such as the ones described by Slusher et
al. (Slusher et al. - Prostate - 2000, 44(1): 55-60) or any other technique well known to those
skilled in the art. Potential substrates are any substance containing a peptide bond, more especially
C-terminal peptide bonds, and even more specifically, C-terminal glutamate. Such substances
include, but are not limited to, peptides, folic acid and its analogues (e.g. methotrexate).

In an embodiment, the protein of the invention or part thereof could be used to develop
assay tools to identify brain and spinal cord tissue since the protein of the invention is
overexpressed in these tissues.

In still another embodiment, the protein of the invention or part thereof may be used to
diagnose, treat and/or prevent disorders where the presence of substrates, for example excess
proteins, is undesirable or deleterious. Such disorders include, but are not limited to, cancer,
neurodegenerative disorders such as Parkinson's and Alzheimer's diseases, and diabetes. In a most
preferred embodiment, the protein of the invention or part thereof can be used in cancer
chemotherapies in rescue therapy following toxic high-dose methotrexate regimes. Enzymes of the
M20 peptidase family can cleave the C-terminal glutamate moiety from folic acid and its analogues,
such as methotrexate. The key role of reduced folates as coenzymes in many biological pathways,
including those leading to DNA synthesis via the pyrimidines and purines, has made folic acid a
target molecule for chemotherapy. Tumor cells grow rapidly and have a high rate of nucleic acid
synthesis. Depletion of folic acid has cytotoxic effects, primarily in replicating tissues, and can
inhibit growth of tumors with high folic acid requirements. Enzymes of the M20 peptidase family
can directly deplete folate by hydrolytic removal of its glutamate moiety. In cancer chemotherapy,
methotrexate (4-amino-N10-methyl-pteroyl-glutamate) is commonly used to deplete the pool of
reduced folates by inhibiting dihydrofolate reductase (DHFR), which catalyses the reduction of
folates into the biologically active tetrahydrofolate form, essential in the biosynthesis of all folate
coenzymes. Thus the protein of the invention or part thereof could be used in rescue therapy
following toxic high-dose regimes such as described by Widemann et al. (Widemann B. et al. -
Proc. Am. Assoc. Cancer Res. - 1995, 36, p232) and Chabner et al. (Chabner B. et al. - Nature -
1972, 239, p395-397), which disclosures are hereby incorporated by reference in their entirety. The
basis of this strategy is that hydrolysis of methotrexate produces 4-amino-N10-methyl-pteroate that
is about 100 times less active as an inhibitor of DHFR.
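
The rationale can be illustrated with a deliberately simple calculation. The sketch below assumes, purely for illustration, that inhibitory potency adds linearly across the two species and that the hydrolysis product retains 1/100 of the potency of methotrexate (the approximate figure given above). The function name is hypothetical, and the model ignores pharmacokinetics and binding kinetics.

```python
def residual_relative_potency(fraction_hydrolysed, product_relative_potency=0.01):
    """Relative DHFR-inhibitory potency left after part of the drug is hydrolysed.

    Assumes a linear mixture: unhydrolysed drug counts as 1.0 and the
    hydrolysis product as `product_relative_potency` (about 1/100 per the text).
    """
    if not 0.0 <= fraction_hydrolysed <= 1.0:
        raise ValueError("fraction_hydrolysed must be between 0 and 1")
    remaining = 1.0 - fraction_hydrolysed
    return remaining + fraction_hydrolysed * product_relative_potency


# Tabulate the residual potency for a few illustrative hydrolysis fractions.
for fraction in (0.0, 0.5, 0.9, 1.0):
    print(fraction, round(residual_relative_potency(fraction), 4))
```

Under these assumptions, complete hydrolysis leaves about 1% of the original inhibitory potency, which is the sense in which the enzyme "rescues" tissue from a toxic dose.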

In another preferred embodiment, the protein of the invention or part thereof can be used in
an enzyme/prodrug strategy to treat a number of pathologies, especially those treated with drugs
associated with severe side effects, including, but not limited to, autoimmune diseases and chronic
inflammatory diseases such as rheumatoid arthritis, and cancer chemotherapy. These side effects
can be mainly explained by the fact that the in vivo selectivity of the drugs used is too low (for
example, the inadequate selectivity between tumor and normal cells of most anticancer drugs is well
known and their toxicity to normal tissues is dose limiting). In the first phase of one example of
such a protocol, a conjugate of the protein of the invention or part thereof and an antibody to a
tissue specific antigen (for example, tumor specific antigens in the case of cancer chemotherapy) is
administered. After a delay to allow residual enzyme conjugate to be cleared from the blood, a
relatively non-toxic compound is administered to the patient. This non-toxic compound is a
substrate of the protein of the invention, and is converted by the protein into a substantially more
toxic compound. Thus, because of the previous, targeted administration of the protein of the
invention, when the non-toxic compound is administered, the toxic compound is only produced in
the vicinity of the cells targeted by the conjugate. This two-phase approach has been termed
antibody-directed enzyme-prodrug therapy (ADEPT) and is reviewed by Melton et al.
(Melton R. et al. - J. Natl. Cancer Inst. - 1996, 88, p153-165). Alternatively, the first phase can be
replaced by a gene therapy approach resulting in the de novo synthesis of the protein of the
invention or part thereof by cells from the targeted tissue; this has been termed gene-dependent
enzyme/prodrug therapy (GDEPT). Another advantage of these two approaches (ADEPT and
GDEPT) is that a single enzyme molecule is capable of activating many prodrug molecules.

Protein of SEQ ID NO: 401 (internal designation 160-88-3-0-A8-CS.corr)

The protein of SEQ ID NO: 401 encoded by the cDNA of SEQ ID NO: 160 is a splicing
variant of the hypothetical human palmitoyl-protein thioesterase-2 (PPT2) (E.C. 3.1.2.22) (Genbank
accession number AF020543), which is well conserved among eukaryotes (C. elegans and rodents)
and exhibits homology with the palmitoyl-protein thioesterase-1 (PPT1) (Genbank accession
number L42809). The product of the cDNA of SEQ ID NO: 160 is shorter than the human PPT2 (280
versus 308 amino acids, respectively) with a gap located between positions 174 and 203 of the
PPT2 protein. The protein of SEQ ID NO: 401 has a variant, the protein of SEQ ID NO: 402
encoded by the cDNA of SEQ ID NO: 161, thought to have the same functions and utilities.
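
The stated lengths and gap coordinates can be cross-checked with simple arithmetic. The sketch below assumes that "between positions 174 and 203" means that residues 175 to 202 of PPT2 are absent from the variant; that reading, and the helper name, are assumptions made for illustration.

```python
def residues_strictly_between(left, right):
    """Number of residues lying strictly between two 1-based positions."""
    if right <= left:
        raise ValueError("right must be greater than left")
    return right - left - 1


PPT2_LENGTH = 308      # amino acids, as stated in the text
VARIANT_LENGTH = 280   # product of the cDNA of SEQ ID NO: 160

# Compare the size of the gap with the difference in protein length.
gap_size = residues_strictly_between(174, 203)
length_difference = PPT2_LENGTH - VARIANT_LENGTH
print(gap_size, length_difference)
```

Under this reading the gap accounts for the whole difference in length between the two proteins, which is consistent with a single skipped segment in the splice variant.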

PPT1 (E.C. 3.1.2.22) is a well-described protein, widely conserved among the murine, rat,
bovine and human species (Swissprot accession number P50897). It is a lysosomal enzyme that
functions in the removal of fatty acids from modified cysteine residues in proteins undergoing
degradation (Hofmann S.L. et al., Neuropediatrics, 28: 27-30 (1997)). For example, PPT1 catalyses
the deacylation of H-ras and the alpha subunits of heterotrimeric G proteins in vitro (Camp L.A., J.
Biol. Chem., 268: 22566-22574 (1993) and 269: 23212-23219 (1994)). Deacylation by PPT1 may
be a prerequisite for complete digestion of the modified polypeptides. In fact, there is evidence that
palmitoylation leads to increased protection against proteolytic digestion. Both the salivary mucus
glycoprotein (Slomiany B. L., Biochem. Biophys. Res. Commun., 151: 1046-1053 (1988)) and
chemically acylated bee venom phospholipase A2 (Diaz, R.E., Biochem. Biophys. Acta, 830: 52-58
(1985)) are more resistant to treatment with proteinases than their deacylated forms. Mutations in
the PPT1 enzyme were shown to underlie the hereditary neurodegenerative disorder infantile neuronal
ceroid lipofuscinosis (Vesa et al., Nature, 376: 584-587 (1995)).

Recently, Soyombo and Hofmann (J. Biol. Chem., 272: 27456-27463 (1997)) described a
second lysosomal thioesterase, PPT2, that shares 20% identity with PPT1. The PPT2 enzyme
presumably also plays a role in lysosomal thioester catabolism but has a substrate specificity
distinct from that of PPT1. While little is known about the substrate specificity of PPT2, the
enzyme is highly active against palmitoylated model substrates such as palmitoyl CoA. PPT2 did
not hydrolyse the acyl-cysteine bond of the protein substrates routinely used to assay PPT1, such as
H-Ras and albumin. This finding suggests that although both enzymes possess intrinsic palmitoyl
thioesterase activity, the "leaving group" recognized by the enzymes may differ. One possibility is
that PPT2 recognizes palmitoylated protein substrates but that these substrates differ from those
recognized by PPT1. A second possibility is that PPT2 recognizes a novel lipid thioester substrate
that is not derived from acylated proteins. Aguado et al. (Biochem. J., 341: 679-689 (1999))
demonstrated that PPT2 is an acyl thioesterase. However, they could not distinguish between esterase
(thioesterase) and lipase activity. PPT2 shows very high S-thioesterase activity towards the acyl
chains C14:0 > C16:0; moderate activity towards the acyl chains C14:1 > C20:4 ≈ C16:1 ≈ C18:0 ≈
C12:0 > C18:2 ≈ C18:3 > C22:1 ≈ C18:1 ≈ C20:0; low activity towards the acyl chains C10:0 and
C22:0; and no activity towards the acyl chains C24:0, C8:0, C6:0, C4:0 and C2:0. PPT2 has a
broader range of action than PPT1,
although both have a preference for long acyl chains (more than 12 or 14 carbons) over shorter acyl
chains (less than 12 carbons). Aguado et al. (supra) also presented a detailed characterization of
the PPT2 gene product. The putative 302-residue PPT2 and the protein of the invention contain a
hydrophobic leader peptide at the N-terminus (signal peptide with a cleavage site predicted at
position 34 of the protein of the invention), suggesting that they are secretory glycoproteins. Both
proteins exhibit two motifs located at the N-terminus from positions 108 to 121. One motif is
common to triglyceride lipases (from positions 110 to 121) and the other to eukaryotic thiol
(Cys) proteases (from positions 108 to 121). Triglyceride lipases are lipolytic enzymes that
hydrolyse the ester bond of triglycerides. The most conserved region in all these proteins is centered
on a serine residue located in a conserved Gly-Xaa-Ser-Xaa-Gly motif. The PPT2 protein and the
protein of the invention contain a cysteine residue (position 115) instead of the first glycine residue
in the motif, but other lipases with one mismatch in the consensus have been described
(Blow D., Nature, 343: 694-695 (1990)). In the same region as the lipase motif, PPT2 and the
protein of the invention contain a motif common to the active site of eukaryotic thiol (Cys)
proteases but with a leucine residue (position 113) instead of the glycine at position 5 of the
pattern. In addition, the amino acid sequence of the putative PPT2 shows, at the C-terminus, from
positions 171 to 186, a motif common to the growth factor and cytokine receptor family, which is not
present in the protein of the invention.
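
The kind of motif search described above, a Gly-Xaa-Ser-Xaa-Gly pattern that tolerates one mismatch at a glycine, can be sketched as follows. This is a minimal illustration and not the tool used to annotate the sequences; the function name is hypothetical and the test sequence is an invented toy peptide, not SEQ ID NO: 401.

```python
def find_gxsxg_like(seq, max_mismatches=1):
    """Find 5-residue windows resembling Gly-Xaa-Ser-Xaa-Gly.

    The central serine is required; up to `max_mismatches` of the two
    flanking glycines may be replaced by another residue.
    Returns (1-based start position, window, mismatch count) tuples.
    """
    hits = []
    seq = seq.upper()
    for i in range(len(seq) - 4):
        window = seq[i:i + 5]
        if window[2] != "S":
            continue
        mismatches = (window[0] != "G") + (window[4] != "G")
        if mismatches <= max_mismatches:
            hits.append((i + 1, window, mismatches))
    return hits


# Invented toy peptide containing one exact motif (GHSMG) and one variant
# in which cysteine replaces the first glycine (CASQG).
toy_peptide = "MKTAGHSMGLLCASQGPP"
print(find_gxsxg_like(toy_peptide))
print(find_gxsxg_like(toy_peptide, max_mismatches=0))
```

Allowing a single mismatch is what lets a Cys-Xaa-Ser-Xaa-Gly window, like the one described for PPT2 and the protein of the invention, be reported alongside exact matches.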
Aguado et al. (supra) have found that PPT2 is expressed in cells of the immune system as
an approximately 42 kDa protein in cell extracts and supernatants and is transcribed as at least
five different transcripts. The PPT2 gene is located in the class III region of the human MHC, which
contains several genes encoding proteins with potential roles in the immune system and in
inflammation. In addition, Aguado et al. (supra) showed that very large amounts of PPT2 are
secreted. However, this is not in disagreement with an intracellular activity because the secreted
protein could be internalized into the cell through a receptor and act on a target located in an
intracellular organelle. This mechanism has been described for the secreted PPT1, which can be
internalized into the cell by the mannose-6-phosphate receptor to act in the lysosome (Verkruyse and
Hofmann, J. Biol. Chem., 271: 15831-15836 (1996)), and Soyombo and Hofmann (J. Biol. Chem.,
272: 27456-27463 (1997)) reported that PPT2 binds to the mannose-6-phosphate receptor.

Palmitoylation refers to the posttranslational modification of proteins in which the most
common fatty acids of the cell (i.e. palmitic, stearic and oleic acids) are attached to the side chain of
cysteine residues via high-energy thioester linkages (Bizzozero, O.A. et al., Neurochem. Res., 19:
923-933 (1994); Casey P.J., Science, 268: 221-225 (1995)). At present a large number of proteins of
diverse origin, structure and function are known to be modified with these fatty acids, which attach
them to the inner surface of the plasma membrane, where they can function optimally (Casey P.J.,
Science, 268: 221-225 (1995)). Being anchored to membranes is necessary for the diverse
cellular functions of these modified proteins, including signal transduction, vesicle transport and
maintenance of the cytoarchitecture. Almost every tissue and subcellular organelle contains a
characteristic set of palmitoylated proteins.

The protein of the invention is overexpressed in brain. In recent years a considerable
number of functionally relevant nervous system proteins, including ion channels, neurotransmitter
receptors, signal transduction components and cell-adhesion molecules, have been found to be
palmitoylated. Although the nervous system is not an exception to this rule, both the number of
modified proteins in this tissue and the dynamic nature of protein palmitoylation suggest that this
modification is critical for regulating important biological processes and that the addition or
removal of the fatty acid serves to regulate the activity of these proteins rather than to define their
function.

It is believed that the protein of SEQ ID NO: 401 or part thereof is a hydrolase, preferably
acting on ester bonds, more preferably a thiolester hydrolase, even more preferably an
acyl-thioesterase which, as such, plays a role in fatty acid metabolism, in cellular vesicle transport and
maintenance of the cytoarchitecture, in cellular proteolysis, endocytosis, signal transduction,
lysosomal storage, cell proliferation and differentiation, and immune and inflammatory response. The
enzyme's substrates are compounds preferably containing an ester bond, preferably a thiol ester
bond, more preferably an acyl thioester bond. Preferred polypeptides of the invention are
polypeptides comprising the amino acids of SEQ ID NO: 401 from positions 108 to 121, and 110 to
121. Other preferred polypeptides of the invention are fragments of SEQ ID NO: 401 having any of
the biological activities described herein. The hydrolytic activity of the protein of the invention or
part thereof may be assayed using any of the assays known to those skilled in the art, including those
described in Smith et al., Biochem. J., 212: 155 (1983), Spencer et al., J. Biol. Chem., 253: 5922
(1978) and Aguado et al. (supra) or in US patent 5,445,942.

In another preferred embodiment, the protein of the invention or part thereof may be used to
diagnose, treat and/or prevent disorders where the presence of substrates is undesirable or
deleterious. Such disorders include, but are not limited to, infantile neuronal ceroid lipofuscinosis
and lysosomal diseases. For diagnostic purposes, the expression of the protein of the invention
could be investigated using any of the Northern blotting, RT-PCR or immunoblotting methods
described herein and compared to the expression in control individuals. For prevention and/or
treatment purposes, the expression of the protein of the invention may be enhanced using any of the
gene therapy methods described herein or known to those skilled in the art.

In addition, the protein of the invention or part thereof may be used to identify inhibitors for
mechanistic and clinical applications. Such inhibitors may then be used to identify or quantify the
protein of the invention in a sample, and to diagnose, treat or prevent any of the disorders where the
protein's hydrolytic activity is undesirable and/or deleterious, including but not limited to lysosomal
diseases, neurodegenerative disorders such as infantile neuronal ceroid lipofuscinosis, Parkinson's
and Alzheimer's diseases, and inflammatory and immune disorders including allergies and leukemia.

Further objects of the present invention are compositions and methods of targeting
heterologous compounds, either polypeptides or polynucleotides, to lysosomes by recombinantly or
chemically fusing a fragment of the protein of the invention to a heterologous polypeptide or
polynucleotide. Preferred fragments are any fragments of the protein of the invention, or part
thereof, that may contain targeting signals for lysosomes such as those described in Vitale et al.,
Mol. Cell. Biol., 20: 7342-52 (2000), Blagoveshchenskaya et al., J. Biol. Chem., 273: 2729-37 (1998)
and Kornfeld, FASEB J., 1: 462-8 (1987). Such heterologous compounds may be used to modulate
lysosomal activity. For example, they may be used to induce and/or prevent lysosomal protein
degradation. Moreover, antibodies binding to the protein of the invention or part thereof may be
used for detection of the lysosomes using any techniques known to those skilled in the art.

In still another embodiment, the invention relates to methods and compositions using the
protein of the invention or part thereof as a marker protein to selectively identify tissues, preferably
brain tissues. For example, the protein of the invention or part thereof may be used to synthesize specific
antibodies using any techniques known to those skilled in the art, including those described herein.
Such tissue-specific antibodies may then be used to identify tissues of unknown origin, for example,
forensic samples or differentiated tumor tissue that has metastasized to foreign bodily sites, or to
differentiate different tissue types in a tissue cross-section using immunochemistry.
Another embodiment of the present invention relates to methods and compositions using the
protein of the invention or part thereof to modify plant lipid composition using any assay known to
those skilled in the art, including those described by the US patents 5,955,650, 5,945,585 and
5,807,893. Indeed, plant lipids have a variety of nutritional uses and many recent research efforts
have examined the role that saturated and unsaturated fatty acids play in reducing the risk of
coronary heart disease. In the past, it was believed that mono-unsaturates, in contrast to saturates
and poly-unsaturates, had no effect on serum cholesterol and coronary heart disease risk. Several
recent human clinical studies suggest that diets high in mono-unsaturated fat and low in saturated
fat may reduce the "bad" (low-density lipoprotein) cholesterol while maintaining the "good"
(high-density lipoprotein) cholesterol (Mattson et al., Journal of Lipid Research, 26: 194-202 (1985)).

In still another embodiment, the protein of the invention or part thereof may be used in
enzyme replacement therapy, due to the ability of cells to take up exogenously supplied protein
and target it to lysosomes (Neufeld E.F., Annu. Rev. Biochem. 60: 257-280 (1991), Brady R.O. et al.,
J. Inher. Metab. Dis. 17: 510-519 (1994)), or in bone-marrow transplantation (Hoogerbrugge P.M. et
al., Lancet, 345: 1398-1402 (1995)), as bone-marrow-derived microglial cells are believed to
penetrate the blood-brain barrier and may theoretically be able to provide sufficient enzyme to
correct the metabolic defect in neurons (Krivit W., Cell Transplant., 4: 385-392 (1995)). The protein
of the invention or part thereof may also be used in genetic engineering of transplanted cells
(Salvetti A. et al., Br. Med. J. 51: 106-122 (1995)) or neural progenitor cell engraftment (Snyder
E.Y., Nature, 374: 367-370 (1995)) using any technique known to those skilled in the art.

Protein of SEQ ID NO: 254 (internal designation 106-006-1-0-E3-CS)

Angiogenin is a member of the pancreatic RNase superfamily of proteins. Its mechanism of
action is postulated to involve multiple interactions with other proteins through specific regions on
the molecular surface of angiogenin. Potential partners of angiogenin include heparin, plasminogen,
elastase, angiostatin, actin, and a 170 kDa receptor on the surface of endothelial cells [Strydom, D.
J. (1998) Cell. Mol. Life Sci. 54, 811-824].

Angiogenin is required for the process of angiogenesis. Tumor growth requires
angiogenesis, and several anti-angiogenic agents have been produced and are currently in the
clinical trial stage. It has also been shown that recurrent gastric cancer patients had a much higher
serum concentration of angiogenin than primary gastric cancer patients [Shimoyama, S. and
Kaminishi, M. (2000) J. Cancer Res. Clin. Oncol. 126, 468-474]. Therefore, angiogenin can be used
as a diagnostic marker for the evaluation of cancer aggressiveness or as an early marker for
recurrence over a follow-up period.

Angiogenin is a potent inducer of angiogenesis [Fett, J. W.; Strydom, D. J.; Lobb, R. R.;
Alderman, E. M.; Bethune, J. L.; Riordan, J. F.; and Vallee, B. L. (1985) Biochemistry 24,
5480-5486]. Angiogenesis is a complex process of blood vessel formation comprising several separate
but interconnected steps at the cellular and biochemical level including: (i) activation of endothelial
cells by the action of an angiogenic stimulus, (ii) adhesion and invasion of activated endothelial
cells into the surrounding tissues and migration toward the source of the angiogenic stimulus, and
(iii) proliferation and differentiation of endothelial cells to form a new microvasculature [Folkman,
J. and Shing, Y. (1992) J. Biol. Chem. 267, 10931-10934; Moscatelli, D. and Rifkin, D. B. (1988)
Biochim. Biophys. Acta 948, 67-85].

Angiogenin has been demonstrated to induce most of the individual events in the process of
angiogenesis including binding to endothelial cells [Badet, J.; Soncin, F.; Guitton, J.D.; Lamare, O.;
Cartwright, T.; and Barritault, D. (1989) Proc. Natl. Acad. Sci. U.S.A. 86, 8427-8431], stimulating
second messengers [Bicknell, R. and Vallee, B. L. (1988) Proc. Natl. Acad. Sci. U.S.A. 85,
5961-5965], mediating cell adhesion [Soncin, F. (1992) Proc. Natl. Acad. Sci. U.S.A. 89, 2232-2236],
activating cell-associated proteases [Hu, G. F. and Riordan, J. F. (1993) Biochem. Biophys. Res.
Commun. 197, 682-687], inducing cell invasion [Hu, G-F.; Riordan, J. F.; and Vallee, B. L. (1994)
Proc. Natl. Acad. Sci. U.S.A. 91, 12096-12100], inducing proliferation of endothelial cells [Hu,
G-F.; Riordan, J. F.; and Vallee, B. L. (1997) Proc. Natl. Acad. Sci. U.S.A. 94, 2204-2209] and
organizing the formation of tubular structures from cultured endothelial cells [Jimi, S-I.; Ito,
K-I.; Kohno, K.; Ono, M.; Kuwano, M.; Itagaki, Y.; and Isikawa, H. (1985) Biochem. Biophys. Res.
Commun. 211, 476-483]. Angiogenin has also been shown to undergo nuclear translocation in
endothelial cells via receptor-mediated endocytosis [Moroianu, J. and Riordan, J. F. (1994) Proc.
Natl. Acad. Sci. U.S.A. 91, 1677-1681] and nuclear localization sequence-assisted nuclear import
[Moroianu, J. and Riordan, J. F. (1994) Biochem. Biophys. Res. Commun. 203, 1765-1772].

While angiogenesis is a tightly-controlled process under usual physiological conditions,
abnormal angiogenesis can have devastating consequences in pathological conditions such as
arthritis, diabetic retinopathy and tumor growth. It is now well-established that the growth of
virtually all solid tumors is angiogenesis dependent [Folkman, J. (1989) J. Natl. Cancer Inst. 82,
4-6]. Angiogenesis is also a prerequisite for the development of metastasis, since it provides the
means whereby tumor cells disseminate from the original primary tumor and establish at distant
sites [Mahadevan, V. and Hart, I. R. (1990) Rev. Oncol. 3, 97-103; Blood, C. H. and Zetter B. R.
(1990) Biochim. Biophys. Acta 1032, 89-118]. Therefore, interference with the process of
tumor-induced angiogenesis can be an effective therapy for both primary and metastatic cancers.

Although originally isolated from medium conditioned by human colon cancer cells (Fett et
al. (1985), supra), and subsequently shown to be produced by several other histological types of
human tumors [Rybak, S. M.; Fett, J. W.; Yao, Q-Z.; and Vallee, B. L. (1987) Biochem. Biophys.
Res. Commun. 146, 1240-1248; Olson, K. A.; Fett, J. W.; French, T. C.; Key, M. E.; and Vallee, B.
L. (1995) Proc. Natl. Acad. Sci. U.S.A. 92, 442-446], angiogenin also is a constituent of human
plasma and normally circulates at a concentration of 250-360 ng/ml [Shimoyama, S.; Gansauge, F.;
Gansauge, S.; Negri, G.; Oohara, T.; and Beger, H. G. (1996) Cancer Res. 56, 2703-2706; Blaser, 
J.; Triebl, S.; Kopp, C.; and Tschesche, H. (1993) Eur. J. Clin. Chem. Clin. Biochem. 31, 513-516]. 

Several inhibitors of the functions of angiogenin have been developed. These include: (i)
monoclonal antibodies (mAbs) [Fett, J. W.; Olson, K. A.; and Rybak, S. M. (1994) Biochemistry
33, 5421-5427], (ii) an angiogenin-binding protein [Hu, G-F.; Chang, S-I.; Riordan, J. F.; and
Vallee, B. L. (1991) Proc. Natl. Acad. Sci. U.S.A. 88, 2227-2231; Hu, G-F.; Strydom, D. J.; Fett, J.
W.; Riordan, J. F.; and Vallee, B. L. (1993) Proc. Natl. Acad. Sci. U.S.A. 90, 1217-1221; Moroianu,
J.; Fett, J. W.; Riordan, J. F.; and Vallee, B. L. (1993) Proc. Natl. Acad. Sci. U.S.A. 90, 3815-3819],
(iii) the placental ribonuclease inhibitor (PRI) [Shapiro, R. and Vallee, B. L. (1987) Proc. Natl.
Acad. Sci. U.S.A. 84, 2238-2241], (iv) peptides synthesized based on the C-terminal sequence of
angiogenin [Rybak, S. M.; Auld, D. S.; St. Clair, D. K.; Yao, Q-Z.; and Fett, J. W. (1989) Biochem.
Biophys. Res. Commun. 162, 535-543], and (v) inhibitory site-directed mutagenesis of angiogenin
[Shapiro, R. and Vallee, B. L. (1989) Biochemistry 28, 7401-7408].

The subject invention provides the protein/polypeptide of SEQ ID NO: 254. The invention
also provides biologically active fragments of SEQ ID NO: 254. In one embodiment, the
polypeptides of SEQ ID NO: 254 are interchanged with the corresponding polypeptides encoded by
the human cDNA of clone 106-006-1-0-E3-CS. "Biologically active fragments" are defined as
those peptide or polypeptide fragments having at least one of the biological functions of the
full-length protein (e.g., stimulation of angiogenesis). Compositions of the protein/polypeptide of SEQ
ID NO: 254, or biologically active fragments thereof, are also provided by the subject invention.
These compositions may be made according to methods well known in the art.

The invention also provides variants of the protein of SEQ ID NO: 254. These variants
have at least about 80%, more preferably at least about 90%, and most preferably at least about 95%
amino acid sequence identity to the amino acid sequence encoded by SEQ ID NO: 254. Variants
according to the subject invention also have at least one functional or structural characteristic of the
protein of SEQ ID NO: 254. The invention also provides biologically active fragments of the
variant proteins. Compositions of variants, or biologically active fragments thereof, are also
provided by the subject invention. These compositions may be made according to methods well
known in the art. Unless otherwise indicated, the methods disclosed herein can be practiced
utilizing the protein encoded by SEQ ID NO: 254, biologically active fragments of SEQ ID NO:
254, variants of SEQ ID NO: 254, and biologically active fragments of the variants.
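
To make the identity thresholds concrete, the sketch below computes percent identity for two sequences that have already been aligned. The choice of denominator (all aligned columns, including gapped ones) is only one of several conventions and is an assumption here, as are the function names and the toy sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; the denominator is
    the number of remaining aligned columns (one of several conventions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def identity_tier(pct):
    """Return the highest recited threshold (95, 90 or 80) met by pct, else None."""
    for threshold in (95, 90, 80):
        if pct >= threshold:
            return threshold
    return None


# Invented toy alignment: nine of ten positions are identical.
pct = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(pct, identity_tier(pct))
```

In practice the alignment itself would come from a standard alignment program, and the identity convention used should be stated alongside any reported percentage.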

Because of the redundancy of the genetic code, a variety of different DNA sequences can
encode the amino acid sequence of SEQ ID NO: 254. In a preferred embodiment, SEQ ID NO: 254
is encoded by clone 106-006-1-0-E3-CS. It is well within the skill of a person trained in the art to
create these alternative DNA sequences which encode proteins having the same, or essentially the
same, amino acid sequence. These variant DNA sequences are, thus, within the scope of the subject
invention. As used herein, reference to "essentially the same" sequence refers to sequences that
have amino acid substitutions, deletions, additions, or insertions that do not materially affect
biological activity. Fragments retaining one or more characteristic biological activities of the protein
encoded by clone 106-006-1-0-E3-CS are also included in this definition.
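
The extent of this redundancy can be illustrated by counting how many distinct DNA sequences encode a given peptide under the standard genetic code. The sketch below is illustrative only; the function name is hypothetical and the example peptides are arbitrary, not fragments of SEQ ID NO: 254.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; stop codons are not included).
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide):
    """Number of distinct DNA coding sequences for a peptide (stop codon excluded)."""
    return prod(CODON_COUNT[aa] for aa in peptide.upper())


print(count_coding_sequences("MW"))   # Met and Trp each have a single codon
print(count_coding_sequences("LSR"))  # Leu, Ser and Arg each have six codons
```

Because the count is a product over residues, it grows very quickly with peptide length, which is why a large family of alternative DNA sequences can encode one protein.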

"Recombinant nucleotide variants" are alternate polynucleotides which encode a particular
protein. They can be synthesized, for example, by making use of the "redundancy" in the genetic
code. Various codon substitutions, such as the silent changes which produce specific restriction
sites or codon usage-specific mutations, can be introduced to optimize cloning into a plasmid or
viral vector or expression in a particular prokaryotic or eukaryotic host system, respectively.

In one aspect of the subject invention, SEQ ID NO: 254, and variants thereof, can be used
to generate polyclonal or monoclonal antibodies. Both biologically active and immunogenic
fragments of SEQ ID NO: 254, or variant proteins, can be used to produce antibodies. Polyclonal
and/or monoclonal antibodies can be made according to methods well known to the skilled artisan.
Antibodies produced in accordance with the subject invention can be used in a variety of detection
assays known to those skilled in the art. The antibodies may be used to agonize or antagonize the
biological activity of the protein of SEQ ID NO: 254.

SEQ ID NO: 254 can be used as a marker for individuals at risk for the development or
recurrence of tumors. As indicated supra, angiogenin is found at certain levels in normal
individuals, normally at concentrations of 250-360 ng/ml. Thus, quantitative immunoassays can be
used for the detection of abnormal levels of SEQ ID NO: 254, thereby identifying those individuals
at risk for the development of tumors. Alternatively, the subject invention provides antibodies
specific for SEQ ID NO: 254, or fragments thereof, which are used in routine immunoassays to
screen for the presence or absence of SEQ ID NO: 254, or fragments thereof.
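
As a minimal sketch of how a quantitative immunoassay readout might be compared with the circulating range cited above, the following helper flags a measured concentration as below, within, or above 250-360 ng/ml. The function name is hypothetical, and the range is simply the one quoted in the text for angiogenin; it is not a validated clinical cutoff for SEQ ID NO: 254.

```python
NORMAL_RANGE_NG_PER_ML = (250.0, 360.0)  # circulating range cited in the text


def flag_level(concentration_ng_per_ml, normal_range=NORMAL_RANGE_NG_PER_ML):
    """Classify a measured concentration relative to a reference range."""
    low, high = normal_range
    if concentration_ng_per_ml < 0:
        raise ValueError("concentration cannot be negative")
    if concentration_ng_per_ml < low:
        return "below"
    if concentration_ng_per_ml > high:
        return "above"
    return "within"


# Hypothetical assay readouts in ng/ml.
for value in (180.0, 300.0, 420.0):
    print(value, flag_level(value))
```

Values at the boundaries of the range are treated as "within"; a real assay would also need to account for measurement error and assay-specific calibration.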

Alternatively, the nucleic acids which encode SEQ ID NO: 254, or fragments thereof, may
be used in hybridization assays to detect and/or quantitate the expression of SEQ ID NO: 254. Such
hybridization assays are well known to the skilled artisan and can be practiced on a variety of
samples, including, but not limited to, tumor cells, biopsied tissues, or normal tissue.

Molecules (see Strydom, D. J., (1998) Cell. Mol. Life Sci. 54, 811-824) that functionally
inhibit the action of angiogenin can be used to treat patients with tumors. Because angiogenin is
required for the vascularization of tumors, molecules which inhibit the biological activity of
angiogenin can be used to reduce tumor vascularization and control tumor growth. Thus, another
aspect of the invention provides molecules which inhibit, or reduce, the biological activity of SEQ
ID NO: 254. One embodiment provides neutralizing antibodies to inhibit the biological activity of
SEQ ID NO: 254. These neutralizing antibodies may be chimeric or humanized, according to
methods well known in the art, to minimize the immunogenicity of the molecules when used in
patients. Neutralizing antibodies may be used in conjunction with other known therapeutic
modalities for the treatment of tumors.
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Another embodiment of the invention utilizes the concept that expression of specific genes 
can be suppressed by oligonucleotides having a nucleotide sequence complementary to the mRNA 
transcript of the target gene. This suppression occurs by selectively impeding translation and has 
been termed an "antisense" methodology. In addition, "antigene" or "triplex" methodologies may 
5 also suppress expression of genes by using an oligonucleotide which is complementary to a selected 
site of double stranded DNA, thereby forming a triple-stranded complex to selectively inhibit 
transcription of the gene. Both "antisense" and "antigene" methodologies can be used to inhibit or 
reduce the expression of the gene of SEQ ID NO: 254, and thereby provide therapeutic benefit to 
the patient being treated. Methods of treating individuals using antigene and antisense 

10 methodologies are well known to those skilled in the art (see, for example, "Antisense 
Therapeutics" Agrawal, S. (ed), Humana Press, 1996; Crooke, S. T., and Bennett, C. F. (1996) 
Annu. Rev. Pharmacol. Toxicol. 36, 107-129; "Prospects for the Therapeutic Use of Antigene 
Oligonucleotides", Maher, L. J. (1996) Cancer Investigation 14(1), 66-82 each hereby incorporated 
by reference in its entirety). 
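
By way of illustration only, the sequence complementarity on which the antisense approach relies can be sketched in Python. The transcript fragment `MRNA`, the helper `antisense_oligo` and the 12-nucleotide window around the initiation codon are hypothetical choices made for this sketch; the fragment is not the transcript of SEQ ID NO: 254, and no particular oligonucleotide design is implied.

```python
# Hypothetical mRNA fragment written 5'->3'; NOT the transcript of SEQ ID NO: 254.
MRNA = "GGCACCAUGGUGCUGAGCCUGCUG"

# Watson-Crick partners, mapping an RNA base to the DNA base that pairs with it.
_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna: str, start: int, length: int) -> str:
    """Return a DNA oligonucleotide (5'->3') complementary to mrna[start:start+length]."""
    if start < 0:
        raise ValueError("target window starts before the transcript")
    target = mrna[start:start + length]
    if len(target) != length:
        raise ValueError("target window runs past the end of the transcript")
    # Reverse the target, then complement each base, so the result reads 5'->3'.
    return "".join(_COMPLEMENT[base] for base in reversed(target))


# Centre an illustrative 12-mer on the first AUG found in the fragment.
aug = MRNA.find("AUG")
oligo = antisense_oligo(MRNA, aug - 3, 12)
print(oligo)
```

In this sketch the oligonucleotide is simply the reverse complement of the chosen mRNA window; considerations such as backbone chemistry, secondary structure and specificity are outside its scope.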

As additional examples, U.S. Pat. No. 5,098,890 is directed to antisense oligonucleotides
complementary to the c-myb oncogene and antisense oligonucleotide therapies for certain cancerous
conditions. U.S. Pat. No. 5,135,917 provides antisense oligonucleotides that inhibit human
interleukin-1 receptor expression. U.S. Pat. No. 5,087,617 provides methods for treating cancer
patients with antisense oligonucleotides. U.S. Pat. No. 5,166,195 provides oligonucleotide
inhibitors of HIV. U.S. Pat. No. 5,004,810 provides oligomers capable of hybridizing to herpes
simplex virus Vmw65 mRNA and inhibiting replication. U.S. Pat. No. 5,194,428 provides antisense
oligonucleotides having antiviral activity against influenza virus. U.S. Pat. No. 4,806,463 provides
antisense oligonucleotides and methods using them to inhibit HTLV-III replication. U.S. Pat. No.
5,286,717 is directed to mixed-linkage oligonucleotide phosphorothioates complementary to an
oncogene. U.S. Pat. No. 5,276,019 and U.S. Pat. No. 5,264,423 are directed to phosphorothioate
oligonucleotide analogs used to prevent replication of foreign nucleic acids in cells. Each of these
patents is hereby incorporated by reference in its entirety.

The subject invention also provides modified/derivatized nucleic acids encoding SEQ ID
NO: 254. These include those modifications which increase the stability and/or affinity of these
compounds for targets. Phosphorothioate analogs of oligodeoxynucleotides (ODNs), in which
nonbridging phosphoryl oxygens in the backbone of DNA are substituted with sulfur ([S]ODNs),
are substantially more stable than their native phosphodiester counterparts. Other derivatives, such
as those alkylated on sugar oxygen groups, show enhanced target affinity. [S]ODNs possess good
biological activity, pharmacology, pharmacokinetics and safety in vivo (Agrawal (1996), supra).
Successful inhibition of specific gene function has been achieved by targeting various sites on
specific mRNA sequences that include the AUG translational initiation codon, 5'-transcriptional
start site, 3'-termination codon and sequences in both the 5'- and 3'-untranslated regions. These
derivatized nucleic acids can be used in any of the aforementioned methodologies.



Protein of SEQ ID NO: 387 (internal designation 105-073-2-0-A7-CS)

The protein of SEQ ID NO: 387, encoded by the cDNA of SEQ ID NO: 146, is expressed in
liver, ovary and prostate and is overexpressed in salivary glands. The protein of SEQ ID NO: 387
belongs to the abhydrolase family, and is characterized by the alpha/beta hydrolase fold (Protein Eng
1992;5:197-211, which disclosure is hereby incorporated by reference in its entirety) that is
common to a number of hydrolytic enzymes of widely differing phylogenetic origin and catalytic
function.

The core of each enzyme is an alpha/beta-sheet (rather than a barrel), containing 8 strands
connected by helices. The enzymes are believed to have diverged from a common ancestor,
preserving the arrangement of the catalytic residues. All have a catalytic triad, the elements of
which are borne on loops, which are the best conserved structural features of the fold.

Epoxide hydrolases are a family of enzymes which hydrolyze, by addition of water, a variety of
exogenous and endogenous epoxides to their corresponding diols. On the basis of sequence
similarity, it has been proposed that the mammalian soluble epoxide hydrolase contains two
evolutionarily distinct domains: the N-terminal domain is similar to bacterial haloacid
dehalogenase, while the C-terminal domain is similar to soluble plant epoxide hydrolase,
microsomal epoxide hydrolase, and bacterial haloalkane dehalogenase (DNA Cell Biol. 14:61-71
(1995), which disclosure is hereby incorporated by reference in its entirety). Human epoxide
hydrolase catalyzes the addition of water to epoxides to form the corresponding dihydrodiol. The
enzymatic hydration is essentially irreversible and produces mainly metabolites of lower reactivity
that can be conjugated and excreted. The reaction of epoxide hydrolase is therefore generally
regarded as detoxifying. Commonly, the action of epoxide hydrolase is followed by excretion of the
diols. However, reactivation of certain diols by a second epoxidation may occur. Epoxide hydrolase
also inactivates the epoxides arising in the metabolism of endogenous compounds. Lipophilic
xenobiotics tend to accumulate in tissues, and they must be transformed to water-soluble
compounds to enable their excretion. In this transformation process, reactive intermediates are
produced. If biotransformation fails to detoxify these reactive intermediates, they may react
covalently with critical targets like the genetic material, or start harmful reaction chains like lipid
peroxidation. Therefore, epoxide hydrolases are thought to be responsible for carcinogenicity and
mutagenicity phenomena (Exp Pathol 1990;39(3-4):195-6).
In addition, the interaction between epoxide hydrolase activity and alcohol-metabolizing enzymes
suggests that epoxide hydrolase activity may be associated with susceptibility to alcoholic liver
disease and hepatocellular carcinoma (Toxicol. Lett. 10;115(1):17-22 (2000), which disclosure is
hereby incorporated by reference in its entirety). Compounds containing the epoxide functionality
have become common environmental contaminants because of their wide use as pesticides,
sterilants, and industrial precursors. Such compounds also occur as products, by-products, or
intermediates in normal metabolism and as the result of spontaneous oxidation of membrane lipids
(see, e.g., Brash, et al., Proc. Natl. Acad. Sci., 85:3382-3386 (1988), and Sevanian, A., et al.,
Molecular Basis of Environmental Toxicology (Bhatnager, R. S., ed.) pp. 213-228, Ann Arbor
Science, Michigan (1980)). As three-membered cyclic ethers, epoxides are often very reactive and
have been found to be cytotoxic, mutagenic and carcinogenic (see, e.g., Sugiyama, S., et al., Life Sci.
40:225-231 (1987)). Cleavage of the ether bond in the presence of electrophiles often results in
adduct formation. As a result, epoxides have been implicated as the proximate toxin or mutagen for
a large number of xenobiotics. Detoxification reactions catalyzed by epoxide hydrolases typically
decrease the hydrophobicity of a compound, resulting in a more polar and thereby excretable
substance.

It is believed that the protein of SEQ ID NO: 387 or part thereof is a hydrolase, preferably
an epoxide hydrolase. Preferred polypeptides of the invention are polypeptides comprising the
amino acids of SEQ ID NO: 387 from positions 2 to 132, 52 to 137, 29 to 120, 12 to 137, 19 to 136,
151 to 209, 141 to 209, 30 to 108, and 35 to 108. Other preferred polypeptides of the invention are
fragments of SEQ ID NO: 387 having any of the biological activities described herein. The
hydrolytic activity of the protein of the invention or part thereof may be assayed using any of the
assays known to those skilled in the art including those described in Cancer Res 40(7):2552-6
(1980); Exp Pathol 39(3-4):195-6 (1990), which disclosures are hereby incorporated by reference
in their entireties.
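
As a purely illustrative sketch, the position ranges recited above can be mapped onto a sequence string as follows. It is assumed here that positions are 1-based and inclusive, as is conventional for sequence listings; the `placeholder` sequence is an arbitrary 209-residue stand-in and is not the amino acid sequence of SEQ ID NO: 387, and the helper name `fragment` is hypothetical.

```python
from typing import List, Tuple

# Position ranges recited for SEQ ID NO: 387, taken as 1-based and inclusive (assumption).
PREFERRED_RANGES: List[Tuple[int, int]] = [
    (2, 132), (52, 137), (29, 120), (12, 137), (19, 136),
    (151, 209), (141, 209), (30, 108), (35, 108),
]

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fragment(sequence: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of a protein sequence."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError(f"range {first}-{last} lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence the shift on `first` only.
    return sequence[first - 1:last]


# Arbitrary stand-in sequence, long enough to cover the highest recited position (209).
placeholder = "".join(AMINO_ACIDS[i % 20] for i in range(209))

for first, last in PREFERRED_RANGES:
    piece = fragment(placeholder, first, last)
    print(f"positions {first}-{last}: {len(piece)} residues")
```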

The invention also relates to methods and compositions using the protein of the invention or
part thereof to diagnose, prevent and/or treat several disorders linked to overexpression of the
protein of the invention, including alcoholic liver disease, hepatocellular carcinoma, and ovarian and
prostate cancers.

In addition, the protein of the invention or part thereof may be used to identify inhibitors for 
mechanistic and clinical applications. Such inhibitors may then be used to identify or quantify the 
protein of the invention in a sample, and to diagnose, treat or prevent any of the disorders where the 
protein's hydrolytic activity is undesirable and/or deleterious such as disorders characterized by
tissue degradation including but not limited to amyloidosis, colitis, lysosomal diseases, arthritis,
muscular dystrophy, inflammation, tumor invasion, glomerulonephritis, parasite-borne infections, 
Alzheimer's disease, periodontal disease, and cancer metastasis. 

In another embodiment, the invention relates to methods and compositions using the protein
of the invention or part thereof as a marker protein to selectively identify tissues, preferably ovarian,
liver or prostate, more preferably salivary glands. For example, the protein of the invention or part
thereof may be used to synthesize specific antibodies using any techniques known to those skilled
in the art. Such tissue-specific antibodies may then be used to identify tissues of unknown origin,
for example, forensic samples, differentiated tumor tissue that has metastasized to foreign bodily
sites, or to differentiate different tissue types in a tissue cross-section using immunochemistry.



Protein of SEQ ID NO: 398 (internal designation 160-31-3-0-E4-CS)

The protein of SEQ ID NO: 398, encoded by the cDNA of SEQ ID NO: 157, is
overexpressed in fetal brain and shows homology with diverse hydrolases. The protein of the
invention also displays a motif characteristic of isochorismatase proteins from positions 17 to 147.
In addition, the protein of the invention is an alternatively spliced form of an unnamed human
protein.

It is believed that the protein of SEQ ID NO: 398 or part thereof is a hydrolase, preferably
acting on ether bonds, more preferably an ether hydrolase. Preferred polypeptides of the invention
are polypeptides comprising the amino acids of SEQ ID NO: 398 from positions 17 to 147. Other
preferred polypeptides of the invention are fragments of SEQ ID NO: 398 having any of the
biological activities described herein. The hydrolytic activity of the protein of the invention or part
thereof may be assayed using any of the assays known to those skilled in the art including those
described in US patents 5,445,942; 5,445,956; 6,017,746 and 5,871,616 and in Rusnak et al.,
Biochemistry 29, 1425-1435 (1990).

In another embodiment, the invention relates to methods and compositions using the protein
of the invention or part thereof as a marker protein to selectively identify tissues, preferably fetal
brain. For example, the protein of the invention or part thereof may be used to synthesize specific
antibodies using any techniques known to those skilled in the art including those described herein.
Such tissue-specific antibodies may then be used to identify tissues of unknown origin, for example,
forensic samples, differentiated tumor tissue that has metastasized to foreign bodily sites, or to
differentiate different tissue types in a tissue cross-section using immunochemistry.

Proteins of SEQ ID NOs: 260 and 265 (internal designations 116-004-3-0-A6-CS and
116-091-1-0-D9-CS, respectively)

The protein of SEQ ID NO: 260, encoded by the cDNA of SEQ ID NO: 19 and overexpressed
in liver and testis, is an isoform of the protein of SEQ ID NO: 265, encoded by the cDNA of SEQ ID
NO: 24 and overexpressed in liver. Both proteins show homology to murine EPCS26 (Hemberger M.
et al., Dev. Biol. 222, 158-169 (2000)), with Genbank accession number AF250838. The proteins of
SEQ ID NOs: 260 and 265 contain a signal peptide (cleavage site at position 18) that could allow
export of the protein to the extracellular space or to a cellular membrane, or define a
particular subcellular localization. The cDNA encoding EPCS26 has been shown to be differentially
expressed during the process of trophoblast invasion.

Implantation and placentation are key processes in mammalian embryonic development.
They physically connect the embryo to its mother and are critical for sufficient nutrient and gas
exchange. The extraembryonic cell lineage is the first to differentiate in the developing conceptus,
reflecting the importance of this cell lineage for the establishment of fetal-maternal connections.
During murine development, the outer layer of the blastocyst, the mural trophectoderm, begins to
differentiate into primary trophoblast giant cells on day 5 of gestation (e5). These cells invade the
uterine epithelium and penetrate deeply into the stroma. At the same time, the polar trophectoderm
cells continue to proliferate and form the ectoplacental cone. On e7, the outer cells of the
ectoplacental cone begin to differentiate into secondary trophoblast giant cells. The invasion of
uterine stroma by these cells is critical for successful placentation (Cross et al., Science 266,
1508-1518 (1994)).

Trophoblast invasion triggers secretion of proteinases that degrade extracellular matrix
molecules. Mouse trophoblasts have been shown to synthesize and secrete serine proteases, matrix
metalloproteinases and cysteine proteinases. Invasion of the trophoblast is a highly controlled
process. The decidua restricts invasion by secreting proteinase inhibitors. Proteinases and
proteinase inhibitors have antagonistic functions in implantation and placentation which may be
mirrored by the reciprocity of their expression patterns (Alexander et al., Development 122,
1723-1736 (1996)).

During tumor invasion and metastasis, the degradation of the basement membranes is often
accomplished by the proteinases implicated in implantation and normal trophoblast invasion
(Strickland and Richards Cell 71, 355-357 (1992), Wilson et al. Proc. Natl. Acad. Sci. USA 94,
1402-1407 (1997)). Uncontrolled trophoblast invasion, as in choriocarcinomas, results in one of the
most metastatic tumors known (Strickland and Richards Cell 71, 355-357 (1992)).

A deficient function of the protein of the invention could result in uncontrolled
trophoblast invasion and, as in choriocarcinomas, in one of the most metastatic tumors
known (Strickland and Richards Cell 71, 355-357 (1992)).

It is believed that the proteins of SEQ ID NOs: 260 and 265 or part thereof play a role in
proteolysis, preferably during embryogenesis, more preferably during trophoblast invasion. The
proteins of the invention or part thereof may act as secreted proteinases that degrade extracellular
matrix molecules or, on the contrary, as proteinase inhibitors. Preferred polypeptides of the invention
are polypeptides comprising the amino acids of SEQ ID NO: 260 from positions 7 to 122 and the
amino acids of SEQ ID NO: 265 from positions 7 to 81. Other preferred polypeptides of the
invention are fragments of SEQ ID NOs: 260 and 265 having any of the biological activities
described herein. The proteolytic activity of the proteins of the invention or part thereof may be
assayed using any of the assays known to those skilled in the art including those described in US
patents 6,069,229 and 5,861,267. The protease inhibitor activity of the proteins of the invention or
part thereof may be assayed using any of the assays known to those skilled in the art and using
methods for determining inhibition constants well known to those skilled in the art (see Fersht,
ENZYME STRUCTURE AND MECHANISM, 2nd ed., W.H. Freeman and Co., New York,
(1985)).

In addition, the proteins of the invention or part thereof may be used to diagnose, treat or
prevent any of the disorders characterized by undesirable and/or deleterious hydrolytic activity such
as disorders characterized by tissue degradation including but not limited to amyloidosis, colitis,
lysosomal diseases, arthritis, muscular dystrophy, inflammation, tumor invasion,
glomerulonephritis, parasite-borne infections, Alzheimer's disease, periodontal disease, cancer
metastasis, and choriocarcinoma. For diagnostic purposes, the expression of the proteins of the
invention could be investigated using any of the Northern blotting, RT-PCR or immunoblotting
methods described herein and compared to the expression in control individuals. Alternatively,
inhibitors of the proteins' activity may be developed and used to inhibit and/or reduce that activity
using any methods known to those skilled in the art. Overexpression of the proteins of the
invention or part thereof may be achieved using any of the gene therapy methods described herein.
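
As a non-limiting illustration of how an inhibition constant of the kind referred to above in connection with protease inhibitor activity may be estimated, the following Python sketch assumes simple competitive inhibition, for which the apparent Michaelis constant obeys Km,app = Km (1 + [I]/Ki). The competitive model, the function name `competitive_ki` and all numerical values are assumptions made for this example; they are not measurements on the proteins of the invention.

```python
def competitive_ki(km: float, km_apparent: float, inhibitor_conc: float) -> float:
    """Estimate Ki from Km_app = Km * (1 + [I] / Ki), assuming competitive inhibition.

    All three inputs must use consistent concentration units; Ki is returned
    in the units of the inhibitor concentration.
    """
    if km <= 0 or inhibitor_conc <= 0:
        raise ValueError("Km and inhibitor concentration must be positive")
    ratio = km_apparent / km
    if ratio <= 1.0:
        # No rise in apparent Km: the competitive model gives no finite positive Ki.
        raise ValueError("apparent Km must exceed Km for a competitive inhibitor")
    return inhibitor_conc / (ratio - 1.0)


# Hypothetical values: Km = 10 uM, apparent Km = 30 uM in the presence of 5 uM inhibitor.
ki = competitive_ki(km=10.0, km_apparent=30.0, inhibitor_conc=5.0)
print(f"Ki = {ki:.2f} uM")
```

Other inhibition mechanisms (non-competitive, uncompetitive, tight-binding) require different relationships, and the choice among them would depend on the kinetic data actually obtained.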

In another embodiment, the invention relates to methods and compositions using the proteins
of the invention or part thereof as marker proteins to selectively identify tissues, preferably liver
and testis for the protein of SEQ ID NO: 260, preferably liver for the protein of SEQ ID NO: 265.
For example, the proteins of the invention or part thereof may be used to synthesize specific
antibodies using any techniques known to those skilled in the art including those described herein.
Such tissue-specific antibodies may then be used to identify tissues of unknown origin, for example,
forensic samples, differentiated tumor tissue that has metastasized to foreign bodily sites, or to
differentiate different tissue types in a tissue cross-section using immunochemistry.

Protein of SEQ ID NO: 265 (internal designation 116-088-4-0-A9-CS)

The protein of SEQ ID NO: 265, encoded by the cDNA of SEQ ID NO: 24, is overexpressed
in testis and liver. This protein of the invention is homologous to the GdX protein, also named
UBL4 (Toniolo et al., Proc Natl Acad Sci USA 1988;85:851-5), found in both human (GENPEPT
accession number L44140) and mouse (GENPEPT accession number J04761). In addition,
the 174-amino-acid-long protein of SEQ ID NO: 265, which is similar in size to ubiquitin-like
proteins, displays a pfam consensus domain from positions 1 to 82 that is the hallmark of ubiquitin
family proteins.

Ubiquitin is a protein of 76 amino acid residues, found in all eukaryotic cells, which is
extremely well conserved from protozoa to vertebrates (Jentsch et al. Trends Cell Biol
2000;10:335-42). It plays a key role in a variety of cellular processes, such as ATP-dependent
selective degradation of cellular proteins, maintenance of chromatin structure, regulation of gene
expression, stress response, ribosome biogenesis, cell-cycle progression, signal transduction,
transcription and antigen presentation (Wilkinson et al. Annu Rev Nutr 1995;15:161-89). The first
ubiquitin is covalently ligated to target proteins through an isopeptide linkage between the
C-terminal glycine residue of ubiquitin and the ε-amino group of an internal lysine residue of the
substrate. To generate an efficient proteasomal targeting signal, additional ubiquitin molecules are
linked to the first one by isopeptide bonds, forming branched poly-ubiquitin complexes (Thrower
et al. EMBO J 2000;19:94-102). Covalent binding of ubiquitin to proteins marks them for subsequent
degradation by a multicomponent enzymatic complex known as the 26S proteasome (Hershko et al.
Annu Rev Biochem 1992;61:761-807).

The genes coding for ubiquitin-like proteins fall into two separate classes (Hershko et al. Annu
Rev Biochem 1992;61:761-807). Proteins of the first class are frequently designated as ubiquitin-like
modifiers, or UBLs. They produce polyubiquitin molecules consisting of exact head-to-tail repeats
of ubiquitin, with a variable number of repeats. These linear polymers of ubiquitin are linked
covalently through peptide bonds between the C-terminal glycine residue and N-terminal lysine
residue of contiguous ubiquitin molecules. Proteins of the second class are usually termed
ubiquitin-domain proteins, or UDPs. These proteins bear a single N-terminal domain
that is related to ubiquitin, fused to a C-terminal ribosomal domain consisting of 52 or 76-80
amino-acid residues (Finley et al. Nature 1989;338:394-401). These proteins are not conjugated to
other proteins and function as a heterogeneous group of proteins. To date, this family includes RAD23,
DSK2, PLIC-1, PLIC-2/Chap1, XDRP1, BAG-1, BAT3/Chap2, Scythe, Parkin, UEP28, UBP6,
Elongin B, and GdX. In addition, the protein of the invention of SEQ ID NO: 265 clearly belongs to
the UDP family, as it displays a single N-terminal ubiquitin consensus domain, which is the hallmark
of this protein family subset.

UDPs participate in the regulation of proteolysis through multiple mechanisms, such as
interaction with the catalytically active 26S proteasome for RAD23 (Schauber et al. Nature
1998;391:715-8), hPLIC-1 and hPLIC-2 (Kleijnen et al. Mol Cell 2000;6:409-19), and BAG-1
(Luders et al. J Biol Chem 2000;275:4613-7), removal of ubiquitin from conjugates for UBP6
(Wyndham et al. Protein Sci 1997;8:1268-75), and negative regulation of multi-ubiquitin chain
assembly for RAD23 (Ortolan et al. Nature Cell Biol 2000;2:601-8). In addition, an increasing body
of evidence indicates that some UDPs participate in other cellular functions such as protein folding
(Luders et al. J Biol Chem 2000;275:4613-7), apoptosis (Kaye et al. FEBS Lett 2000;467:348-55),
and nucleotide-excision repair (de Laat et al. Genes Dev 1999;13:768-785). UDP family proteins
have been shown to be directly associated with the pathogenesis of several diseases, including
xeroderma pigmentosum for RAD23 (Masutani et al. EMBO J 1994;13:1831-43) and Parkinson's
disease for parkin (Kitada et al. Nature 1998;392:605-8). In addition, involvement of ubiquitin-like
proteins or abnormal accumulation of ubiquitinated proteins has been found in multiple human
disorders. Most of them, but not all, involve the central nervous system, such as Alzheimer's disease
(van Leeuwen et al. Science 1998;279:242-7), diffuse Lewy body disease (Iseki et al. J Neurol Sci
1997;146:53-7), Huntington disease (Scherzinger et al. Cell 1997;90:549-58), and amyotrophic
lateral sclerosis (Leigh et al. Brain 1991;114:775-88). In most disorders, ubiquitinated proteins
accumulate within cells and form aggregates termed inclusion bodies that have a characteristic
appearance on histological examination. In addition, abnormal accumulation of ubiquitinated
proteins has been found in von Hippel-Lindau disease (Kamura et al. Proc Natl Acad Sci USA
2000;97:10430-5) and in the liver of alcoholic hepatitis patients (Ohta et al. Lab Invest
1988;59:848-56). Components of hepatocytes are released into the circulation in alcoholic hepatitis
(Sorbi et al. Am J Gastroenterol 1999;94:1018-22).

It is believed that the protein of SEQ ID NO: 265 or part thereof plays a role in the
regulation of proteolysis, preferably as a ubiquitin-like protein, more preferably as a
ubiquitin-domain protein. In addition, the protein of the invention may play a role in protein folding,
apoptosis and nucleotide-excision repair. Preferred polypeptides of the invention are polypeptides
comprising the amino acids of SEQ ID NO: 265 from positions 1 to 82. Other preferred
polypeptides of the invention are fragments of SEQ ID NO: 265 having any of the biological
activities described herein.

In an embodiment, the invention relates to compositions and methods using the protein of
the invention or part thereof to remove, identify or inhibit contaminating proteases in a sample.
Compositions comprising the polypeptides of the present invention may be added to biological
samples as a "cocktail" with other protease inhibitors to prevent degradation of protein samples.
The advantage of using a cocktail of protease inhibitors is that one is able to inhibit a wide range of
proteases without knowing the specificity of any of the proteases. Using a cocktail of protease
inhibitors also protects a protein sample from a wide range of future unknown proteases which may
contaminate a protein sample from a vast number of sources. Such protease inhibitor cocktails (see
for example the ready-to-use cocktails sold by Sigma) are widely used in research laboratory assays
to inhibit proteases capable of degrading a protein of interest for which the assay is to be
performed. For example, the protein of the invention or part thereof is added to samples where
proteolytic degradation by contaminating proteases is undesirable. Alternatively, the protein of the
invention or part thereof may be bound to a chromatographic support, either alone or in
combination with other protease inhibitors, using techniques well known in the art, to form an
affinity chromatography column. A sample containing the undesirable protease is run through the
column to remove the protease. Alternatively, the same methods may be used to identify new
proteases.

Another embodiment of the invention relates to compositions and methods of using the
protein of the invention or part thereof to develop assays for the immunohistochemical detection of
testicular malignant tissue, as the protein is overexpressed in such tissue. For instance, this could be
used for staging lymph node dissemination of testicular cancer using the techniques and methods
detailed in Nazeer et al. Oncol Rep 1998;5:1425-9. The ability to specifically visualize malignant
tissues (and cells derived from the tissues) is useful for numerous applications, including to
determine the origin of, and to identify, e.g., cancerous cells, as well as to facilitate the identification of
particular cells and tissues for, e.g., the evaluation of histological slides.

In another embodiment, the invention relates to compositions or methods using the protein
of SEQ ID NO: 265 or part thereof to diagnose, treat and/or prevent disorders including, but not
limited to, xeroderma pigmentosum, von Hippel-Lindau disease, alcoholic hepatitis, and
neurodegenerative diseases such as Alzheimer's disease, diffuse Lewy body disease, Huntington
disease, and amyotrophic lateral sclerosis. Detection of poly-ubiquitinated protein conjugates in
biological samples, such as brain tissues for the diagnosis of neurodegenerative disorders or liver
and serum or plasma for the diagnosis of alcoholic hepatitis, may be performed using antibodies or
nucleic acids able to detect the expression of the protein of the invention using
immunohistochemistry, enzyme-linked immunosorbent assay (ELISA) or any other technique
known to those skilled in the art, including the Northern blotting, RT-PCR or immunoblotting methods
described herein as well as the technique described in Mimnaugh et al. Electrophoresis
1999;20:418-28. The expression of the protein of the invention in patients' samples is then
compared to the expression in control individuals.
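
For illustration only, one simple way of comparing an expression measurement from a patient sample with measurements from control individuals is sketched below in Python. The fold-change and z-score statistics, the function name `compare_to_controls` and the numerical values are assumptions made for this example; the text above does not prescribe any particular statistical treatment, and no threshold for a positive result is implied.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def compare_to_controls(patient: float, controls: Sequence[float]) -> Tuple[float, float]:
    """Return (fold change, z-score) of a patient value relative to control values.

    Values are assumed to be normalized expression signals (e.g. band intensity
    or ELISA absorbance relative to a loading or housekeeping reference).
    """
    if len(controls) < 2:
        raise ValueError("at least two control values are needed for a z-score")
    control_mean = mean(controls)
    control_sd = stdev(controls)  # sample standard deviation
    if control_mean <= 0 or control_sd == 0:
        raise ValueError("controls must have a positive mean and non-zero spread")
    return patient / control_mean, (patient - control_mean) / control_sd


# Hypothetical normalized signals; not data from any actual sample.
fold, z = compare_to_controls(2.0, [1.0, 1.2, 0.8, 1.0])
print(f"fold change = {fold:.2f}, z-score = {z:.2f}")
```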

In still another embodiment, the invention relates to compositions or methods to treat,
attenuate and/or prevent disorders including, but not limited to, xeroderma pigmentosum, von
Hippel-Lindau disease, alcoholic hepatitis, and neurodegenerative diseases such as Alzheimer's
disease, diffuse Lewy body disease, Huntington disease, and amyotrophic lateral sclerosis using the
protein of the invention, part thereof, or any other compounds developed using the present protein,
such as nucleic acids, antibodies, or chemical substances. In a preferred embodiment, proteins or other
compounds targeted against the protein of the invention or part thereof may be used to treat, prevent
and/or attenuate disorders in which ubiquitin-like proteins or abnormal accumulation of
ubiquitinated proteins has been found and can be involved in pathogenesis of the disease. For
instance, proteins or other compounds targeted against the protein of SEQ ID NO: 265 can be
administered to treat or attenuate symptoms of patients affected with Alzheimer's disease or any
other neurodegenerative disorder.

Protein of SEQ ID NO: 408 (internal designation 174-8-2-0-C10-CS) 

The protein of SEQ ID NO: 408, encoded by the cDNA of SEQ ID NO: 167 and found in
salivary gland and brain, is homologous to a Drosophila melanogaster protein thought to be
a transmembrane protein (STR: Q9V641). The 345-amino-acid-long protein of SEQ ID NO: 408
displays the Rhomboid pfam domain from positions 186 to 323 and is predicted to have six
transmembrane domains from positions 101 to 121, 167 to 187, 204 to 224, 243 to 263, 273 to 293,
and 298 to 318.
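
The coordinates recited above can be tabulated as in the following Python sketch, which reports the length of each predicted transmembrane segment and how many of its residues fall within the Rhomboid pfam domain. Positions are assumed to be 1-based and inclusive; the variable and function names are hypothetical and the sketch performs no prediction of its own.

```python
from typing import List, Tuple

Range = Tuple[int, int]

PROTEIN_LENGTH = 345                 # length recited for SEQ ID NO: 408
RHOMBOID_DOMAIN: Range = (186, 323)  # pfam domain positions recited above
TM_SEGMENTS: List[Range] = [
    (101, 121), (167, 187), (204, 224), (243, 263), (273, 293), (298, 318),
]


def span(r: Range) -> int:
    """Number of residues in a 1-based inclusive range."""
    return r[1] - r[0] + 1


def overlap(a: Range, b: Range) -> int:
    """Number of residues shared by two 1-based inclusive ranges (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


for segment in TM_SEGMENTS:
    shared = overlap(segment, RHOMBOID_DOMAIN)
    print(f"TM {segment[0]}-{segment[1]}: {span(segment)} residues, "
          f"{shared} within the Rhomboid domain")
```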

Rhomboid genes were identified in flies and in organisms as diverse as Arabidopsis, yeast,
bacteria, and mammals. Human and rat homologues of Rhomboid have been identified (Pascal et
al., 1998; FEBS Lett. 429; 337-340). This very widespread conservation implies that the
Rhomboid family proteins have a fundamental function within many cells. The Drosophila
Rhomboid has six transmembrane domains and an amino-terminal hydrophobic region, like the
protein of the invention.

The 355-amino-acid-long Drosophila Rhomboid protein is known to control many aspects
of fly development and, especially, to establish position along the dorsoventral axis and then
later to specify the fate of neuronal precursor cells. Rhomboid expression is sufficient to activate
EGF receptor (EGFr) signaling in all tissues in Drosophila, while loss of Rhomboid mimics
reduction (or loss) of EGFr signaling in almost all tissues (Guichard et al., 1999; Development 126;
2663-2676). As in mammals, the Drosophila EGF receptor controls many aspects of growth and
development. Three activating ligands of the Drosophila EGFr have been described, the most
developmentally significant being the TGF alpha-like molecule, Spitz (Rutledge et al., 1992; Genes
& Dev. 6; 1503-1517). None of the Rhomboid-like proteins from species other than Drosophila
have clearly assigned functions. However, there is compelling genetic evidence from Drosophila
that Rhomboid has a key role in intercellular signaling: it functions as an activator of the EGF
receptor, probably by controlling the activation of the TGF alpha-like ligand Spitz (Guichard et al.,
1999; Development 126; 2663-2676). Indeed, Rhomboid expression is the principal rate-limiting
step in activation of the Ras/MAP kinase pathway by the EGFr.

Like mammalian TGF alpha, Spitz is synthesized as a functionally inert transmembrane
protein; subsequently, the proteolytic release of the extracellular portion of the molecule gives rise
to a soluble and potent EGFr ligand (Golembo et al., 1996; Development 122; 3363-3370). Unlike
all other essential components of EGFr signaling, the expression of Rhomboid is tightly restricted to
sites of signaling activity. It has been proposed that Rhomboid attains its key role in the pathway by
regulating the proteolytic cleavage of Spitz (Wasserman et al., 2000; Genes & Development 14;
1651-1663). The preeminence of Rhomboid in a pathway as critical to development and growth
control as the EGFr/Ras/MAP kinase cascade provides a strong incentive to understand its molecular
mechanism. By analogy to mammalian EGFr ligands that are similarly processed, Spitz cleavage is
expected to be catalyzed by an ADAM-like protease (Black et al., 1998; Curr. Opin. Cell Biol. 10;
654-659), but Rhomboid resembles no known protease.

The Drosophila eye has served as a useful model for studying mechanisms of EGFr and Ras 
signaling. At least five different roles for the receptor have been identified (for reviews see
Wasserman et al., 2000; Genes & Development; 14; 1651-1663), the best characterized being its
function in recruiting cells into the developing ommatidium, the individual unit of the fly
compound eye. Each ommatidium contains eight photoreceptors, four cone cells that secrete lens
material, and an average of eight pigment cells. It has been shown that the fly EGFr has a role in
regulating cell survival in the developing eye (Dominguez et al. 1998; Curr. Biol. 8; 1039-1048). 

The EGFr signaling pathway has been conserved between flies and vertebrates. The EGFr
family consists of four members, HER1 (c-erbB1, EGFR), HER2 (c-erbB2), HER3 (c-erbB3) and
HER4 (c-erbB4), expressed in a wide range of cells (Gullick W.J., 1998; Br. Cancer Res. Treat.; 52;
43-53). TGF-alpha and its homologs have been found to be the most abundant ligands for the
EGF/TGF-alpha receptor in most parts of the brain (Kaser et al., (1992) Brain Res. Mol. Brain Res.,
16:316-322). TGF-alpha is widely distributed in various regions of the brain, in contrast to EGF,
which is present only in smaller, more discrete areas, suggesting that TGF-alpha might play a
physiological role in brain tissues. The numerous receptor sites for TGF-alpha in the brain suggest
that TGF-alpha has an important utility in promoting normal brain cell differentiation and function.

Transforming growth factor alpha (TGF-alpha) is a relative of epidermal growth factor
(EGF) and, like EGF, it exerts its effects on cells through binding to the EGF receptor. The precise
physiological role of TGF-alpha is still not clear, although it appears to be important in eye and hair
follicle development and may play a role in both the immune system and in wound healing (see
Kumar et al., 1995, Cell Biology International, 19:5, 373-388). The EGF receptor family currently
includes four EGF receptors. The EGFR2 receptor may also be referred to as ERB-2 and this
molecule is useful for a variety of diagnostic and therapeutic indications (Prigent, S. A., and
Lemoine, N. R., (1992) Prog. Growth Factor Res., 4:1-24). TGF-alpha is likely a ligand for one
or more of these receptors as well as for an as yet unidentified new EGF-type receptor. Use of
TGF-alpha can assist with the identification, characterization and cloning of such receptors. For
example, the EGF receptor gene represents the cellular homolog of the v-erb-B oncogene of avian
erythroblastosis virus. Overexpression of the EGF receptor or deletion of kinase regulatory
segments of the protein can bring about tumorigenic transformation of cells (Manjusri, D. et al.,
(1991) Human Cytokines, 364 and 381).

The EGF receptor, and the related ErbB family of receptor tyrosine kinases, have indeed
been much implicated in human cancer. It is commonly believed that hyperactive receptor
signaling dysregulates growth control and is involved in the onset of malignancy, as well
as in the disruption of developmental programs. Very little, however, is known about ErbB
physiological regulation in humans. The fruit fly, Drosophila melanogaster, has a single receptor
homologous to the four ErbB receptors. As signaling mechanisms have been well conserved
between flies and mammals, the results of experiments in flies are relevant to the study of the
human receptors in development and disease. Two areas of recent progress are emphasized. First, a
number of signal modulators have been identified, including three EGF receptor inhibitors, several
of which have human homologues. Second, the signaling molecules are integrated into regulatory
networks that specify the elaborate activation profiles needed in development (positive and negative
feedback control of EGF receptor signaling emerges as a central theme).

It is thus important to discover whether Rhomboid-like proteins in other higher organisms,
including mammals, also have functions similar to those observed in Drosophila, because of the
substantial clinical importance of the EGFr pathway.

It is believed that the protein of SEQ ID NO: 408 or part thereof plays a role in the control
of cellular signaling. Preferably the protein of the invention or part thereof plays a role in the
activation of EGFr-mediated cell signaling, probably through the control of the activation of EGFr
ligands, such as EGF, TGF alpha and TGF alpha-like factor, more probably through the proteolytic
cleavage of such ligands. Preferred polypeptides of the invention are polypeptides comprising the
amino acids of SEQ ID NO: 408 from positions 186 to 323, 101 to 121, 167 to 187, 204 to 224, 243
to 263, 273 to 293, and 298 to 318. Other preferred polypeptides of the invention are fragments of
SEQ ID NO: 408 having any of the biological activities described herein. The proteolytic activity of
the protein of the invention or part thereof, as well as its involvement in the regulation of cellular
signaling through the activation of EGFr, may be assayed using any of the assays known to those
skilled in the art.
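
The preferred fragments above are given as 1-based, inclusive amino acid positions. The following is a minimal sketch, in Python, of how such ranges could be turned into subsequences. The function name `fragment` and the placeholder sequence are hypothetical and used for illustration only; the actual sequence of SEQ ID NO: 408 is not reproduced here, and the placeholder length of 323 residues is an assumption chosen only so that every range fits.

```python
# Preferred position ranges (1-based, inclusive) as listed in the text.
PREFERRED_RANGES = [
    (186, 323), (101, 121), (167, 187), (204, 224),
    (243, 263), (273, 293), (298, 318),
]


def fragment(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {start}-{end} lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return sequence[start - 1:end]


# Hypothetical stand-in for the real sequence (assumption: at least 323 residues).
placeholder = "X" * 323

# Map each range to its fragment and report the fragment lengths.
fragments = {r: fragment(placeholder, *r) for r in PREFERRED_RANGES}
for (start, end), frag in fragments.items():
    print(f"{start}-{end}: {len(frag)} residues")
```

With these ranges, the first fragment spans 138 residues and each of the remaining six spans 21 residues.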

An embodiment of the invention relates to compositions and methods using the protein of
the invention or part thereof to identify and/or quantify the activation of EGF receptors, preferably
vertebrate EGF receptors, more preferably human ErbB receptors, in a biological sample. The
protein can thus be used in assays and diagnostic kits for the quantification of such activation in
bodily fluids, in tissue samples, and in mammalian cell cultures. The assessment of the activation
of EGF receptors may be performed using any assay familiar to those skilled in the art. Preferably,
a defined quantity of the protein of the invention or part thereof is added to the sample under
conditions allowing the activation of EGFr. Then, the activation of EGFr is assayed and optionally
compared to a control using any of the techniques known by those skilled in the art.

The present invention also relates to diagnostic assays for detecting altered levels of the
protein of the present invention in various tissues, since over-expression of the protein compared
to normal control tissue samples can indicate the presence of certain disease conditions such as
neoplasia, skin disorders, ocular disorders and inflammation. Assays used to detect levels of the
polypeptide of the present invention in a sample derived from a host are well-known to those of
skill in the art and include radioimmunoassays, competitive-binding assays, Western blot analysis
and, preferably, ELISA assays.

This invention is also related to the use of SEQ ID No: 167 or its complement as a
diagnostic tool. Detection of a mutated form of the nucleotide sequence of SEQ ID No: 167 of the
present invention will allow a diagnosis of a disease or a susceptibility to a disease which results
from underexpression of the polypeptide of the present invention, for example improper wound
healing, improper neurological functioning, ocular disorders, kidney and liver disorders, and
abnormal hair follicular development, angiogenesis and embryogenesis. Individuals carrying
mutations in the human nucleotide sequence of SEQ ID No: 167 of the present invention may be
detected at the DNA level by a variety of techniques. Nucleic acids for diagnosis may be obtained
from a patient's cells, such as from blood, urine, saliva, tissue biopsy and autopsy material. The
genomic DNA may be used directly for detection or may be amplified enzymatically by using PCR
(Saiki et al., (1986) Nature, 324:163-166) prior to analysis. RNA or cDNA may also be used for the
same purpose. As an example, PCR primers complementary to the nucleic acid encoding a
polypeptide of the present invention can be used to identify and analyze mutations thereof. For
example, deletions and insertions can be detected by a change in size of the amplified product in
comparison to the normal genotype. Point mutations can be identified by hybridizing amplified DNA
to radiolabeled RNA or, alternatively, radiolabeled antisense DNA sequences. Perfectly matched
sequences can be distinguished from mismatched duplexes by RNase A digestion or by differences
in melting temperatures.
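
As an illustration of the size-based comparison described above, the following minimal Python sketch classifies an amplified product as carrying an insertion or a deletion relative to the product expected from the normal genotype. It also estimates the melting temperature of a short perfectly matched oligonucleotide duplex with the widely used Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is swapped in here for illustration and is not a method taken from the present description. The function names, the tolerance parameter and the example values are hypothetical.

```python
def classify_amplicon(observed_bp: int, reference_bp: int, tolerance_bp: int = 0) -> str:
    """Compare an amplified product size with the size expected for the normal genotype.

    tolerance_bp is an assumed allowance for gel or capillary sizing error.
    """
    delta = observed_bp - reference_bp
    if abs(delta) <= tolerance_bp:
        return "no size change"
    return "insertion" if delta > 0 else "deletion"


def wallace_tm(oligo: str) -> int:
    """Approximate Tm (degrees C) of a short oligonucleotide duplex (Wallace rule)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical example values, for illustration only.
print(classify_amplicon(observed_bp=388, reference_bp=400))  # deletion
print(wallace_tm("ATGCATGCATGCATGC"))                        # 48
```

A mismatched duplex generally melts at a lower temperature than the perfectly matched duplex, which is the difference exploited by the melting-temperature approach; the size of that difference depends on the mismatch and its context and is not modelled in this sketch.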

In another embodiment, the protein of the invention or part thereof can be used to diagnose,
treat and/or prevent disorders linked to dysregulation of growth control, such as cancer and other
disorders relating to abnormal cellular differentiation, proliferation, or degeneration, including
hyperaldosteronism, hypocortisolism (Addison's disease), hyperthyroidism (Graves' disease),
hypothyroidism, colorectal polyps, gastritis, gastric and duodenal ulcers, ulcerative colitis,
Crohn's disease, and neurodegenerative disorders such as Parkinson's and Alzheimer's diseases, using
any methods and/or techniques described herein. For diagnostic purposes, the expression of the
protein of the invention could be investigated using any of the Northern blotting, RT-PCR or
immunoblotting methods described herein and compared to the expression in control individuals. In
addition, the protein of the invention or part thereof may be used to evaluate disease progression
and the efficacy of clinical treatment. Inhibition of expression of the protein of the invention or part
thereof, in order to inhibit EGFr activation, could be achieved by many means known to those skilled
in the art, including those described in the present application such as antisense nucleotide or triple
helix strategies.

Protein of SEQ ID NO:291 (internal designation: 180-19-4-0-F4-CS)

The protein of SEQ ID No:291 encoded by the cDNA of SEQ ID No:50 is homologous to
proteins of the tissue inhibitor of metalloproteinases (TIMP) family. The protein of the invention
(207 amino-acids) is highly homologous to and appears to be a variant of the metalloproteinase
inhibitor 1 precursor (TIMP-1, 207 amino-acids) human protein (SwissProt P01033). The protein
of the invention is strongly expressed in the liver, ovary and testis.

There are many different types of collagen found in the body and they, together with other
extracellular matrix components, such as elastin, gelatin, proteoglycan and fibronectin, make up a
large proportion of the body's extracellular tissue. Matrix metalloproteinases (MMPs) are enzymes
that are involved in the degradation and denaturation of extracellular matrix components.
Collagenases, for example, are MMPs that degrade or denature collagen. A large number of
different collagenases are known to exist. These include interstitial collagenases, type IV-specific
collagenases and collagenolytic proteinases. Collagenases are generally specific for collagens
which, in their full triple helix structure, are extremely resistant to other enzymes. Other MMPs are
involved in the degradation and denaturation of different extracellular matrix components, for
example, elastin, gelatin and proteoglycan. Some MMPs are able to degrade or denature several
different types of collagen and also other extracellular matrix components. For example,
stromelysin degrades type IV collagen, which is found in basement membrane, and also has an
effect on other extracellular matrix components such as elastin, fibronectin and cartilage
proteoglycans. The ability of MMPs (such as collagenase, stromelysin, and
gelatinase) to degrade various components of connective tissue makes them potential targets for
controlling numerous pathological processes.

The presence of tissue inhibitors of MMPs has been observed in a variety of explants and in
monolayer cultures of mammalian connective tissue cells (Vater et al 1979 and Stricklin and Welgus
1983). Not only collagenase inhibitors but also inhibitors for other MMPs, for example, gelatinase
and proteoglycanase, have been found. MMP inhibitors are generally unable to bind the inactive
(zymogen) forms of the respective enzymes but complex readily with active forms (Murphy et al
1981). Tissue MMP inhibitors are found, for example, in dermal fibroblasts, human lung, gingival,
tendon and corneal fibroblasts, human osteoblasts, uterine smooth muscle cells, alveolar
macrophages, amniotic fluid, plasma, serum and the alpha-granule of human platelets (Stricklin
and Welgus 1983; Welgus et al 1985; Welgus and Stricklin 1983; Bar-Sharvit et al 1985; Wooley et
al 1976; and Cooper et al 1985).

The protein of the invention is a secreted TIMP-1 protein which tightly complexes with
metalloproteinases and irreversibly inactivates them. TIMP-1 has been identified as a secretory
product of platelets and alveolar macrophages.

Thus, an embodiment of the present invention relates to the use of the protein of the
invention or a fragment thereof to inhibit the action of MMPs by directly inhibiting the enzyme
activity in the manner of a conventional inhibitor. The inhibitory activity of an MMP inhibitor may
be assessed by any method suitable for determining inhibitory activity of a compound with respect
to an enzyme. Such methods are described in standard textbooks of biochemistry.
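
One common way of expressing the result of such an enzyme inhibition assay is as percent inhibition, that is, the fractional loss of enzyme activity in the presence of the inhibitor relative to an uninhibited control. The short Python sketch below shows that calculation only; the function name, the choice of reaction rate as the activity measure and the example values are assumptions made for illustration and do not describe a specific assay of the present description.

```python
def percent_inhibition(rate_with_inhibitor: float, rate_without_inhibitor: float) -> float:
    """Percent loss of enzyme activity relative to the uninhibited control.

    Both rates must be measured under the same conditions and in the same
    units (for example, substrate cleaved per minute).
    """
    if rate_without_inhibitor <= 0:
        raise ValueError("the control rate must be positive")
    return 100.0 * (1.0 - rate_with_inhibitor / rate_without_inhibitor)


# Hypothetical example: control rate 10.0, rate with inhibitor 2.5 (same units).
print(percent_inhibition(2.5, 10.0))  # 75.0
```

Repeating the measurement over a range of inhibitor concentrations would then allow a dose-response relationship to be estimated.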

In one embodiment, the protein of SEQ ID NO:291 can be used to treat and diagnose
disorders associated with excessive MMP expression, including inflammatory disorders such as
rheumatoid arthritis, osteoarthritis, osteopenias such as osteoporosis, pulmonary emphysema,
periodontitis, gingivitis, corneal, epidermal or gastric ulceration, tumour metastasis, invasion
and growth, Paget's disease and hyperparathyroidism. MMP inhibitors are also of potential value in the
treatment of neuroinflammatory disorders, including those involving myelin degradation, for
example multiple sclerosis, as well as in the management of angiogenesis dependent diseases,
which include arthritic conditions and solid tumour growth as well as psoriasis, proliferative
retinopathies, neovascular glaucoma, ocular tumours, angiofibromas and hemangiomas. The
present invention relates to a method of treating diseases in which MMPs are involved such as
atherosclerotic plaque rupture, restenosis, aortic aneurysm (including abdominal aortic aneurysm
and brain aortic aneurysm), congestive heart failure, left ventricular dilatation, myocardial
infarction, decubital ulcers, chronic ulcers or wounds, renal disease, or other autoimmune or
inflammatory diseases dependent upon tissue invasion by leukocytes, Crohn's disease, acute
respiratory distress syndrome, asthma, chronic obstructive pulmonary disease, Alzheimer's disease,
organ transplant toxicity, cachexia, allergic reactions, allergic contact hypersensitivity,
epidermolysis bullosa, loosening of artificial joint implants, stroke, cerebral ischemia, head trauma,
spinal cord injury, neurodegenerative disorders (acute and chronic), Huntington's disease,
Parkinson's disease, migraine, depression, peripheral neuropathy, pain, cerebral amyloid
angiopathy, nootropic or cognition enhancement, amyotrophic lateral sclerosis, ocular angiogenesis,
macular degeneration, abnormal wound healing, burns, diabetes, scleritis, AIDS, sepsis and septic
shock.

In another embodiment, the protein of SEQ ID NO:291 has potential value in the treatment
or diagnosis of atherosclerosis. The rupture of atherosclerotic plaques is the most common event
initiating coronary thrombosis. Destabilization and degradation of the extracellular matrix
surrounding these plaques by MMPs has been proposed as a cause of plaque fissuring. The
shoulders and regions of foam cell accumulation in human atherosclerotic plaques show locally
increased expression of gelatinase B, stromelysin-1, and interstitial collagenase. In situ
zymography of this tissue revealed increased gelatinolytic and caseinolytic activity (Galla, et al., J.
Clin. Invest., 1994;94:2494-2503). In addition, high levels of stromelysin RNA message have been
found to be localized to individual cells in atherosclerotic plaques removed from heart transplant
patients at the time of surgery (Henney, et al., Proc. Natl. Acad. Sci., 1991;88:8154-8158).

In another embodiment, the protein of the invention has utility in treating or detecting
degenerative aortic disease associated with thinning of the medial aortic wall. Increased levels of
the proteolytic activities of MMPs have been identified in patients with aortic aneurysms and aortic
stenosis (Vine N. and Powell J. T., Clin. Sci., 1991;81:233-239).

In another embodiment, the protein of the invention can be used as a treatment or diagnostic
tool for heart failure and associated ventricular dilatation. Heart failure arises from a variety of
diverse etiologies, but a common characteristic is cardiac dilation, which has been identified as an
independent risk factor for mortality (Lee, et al., Am. J. Cardiol., 1993;72:672-676). This
remodeling of the failing heart appears to involve the breakdown of extracellular matrix. MMPs are
increased in patients with both idiopathic and ischemic heart failure (Reddy, et al., Clin. Res.,
1993;41:660A; Tyagi S. C., et al., Clin. Res., 1993;41:681A). Animal models of heart failure have
shown that the induction of gelatinase is important in cardiac dilation (Armstrong, et al., Can. J.
Cardiol., 1994;10:214-220), and cardiac dilation precedes profound deficits in cardiac function
(Sabbah, et al., Am. J. Physiol., 1992;263:H266-H270).

In another embodiment, the protein of the invention is useful in treating or detecting
neointimal proliferation leading to restenosis, which frequently develops after coronary angioplasty.
The migration of vascular smooth muscle cells (VSMCs) from the tunica media to the neointima is
a key event in the development and progression of many vascular diseases and a highly predictable
consequence of mechanical injury to the blood vessel (Bendeck M. P., et al., Circulation Research,
1994;75:539-545). Northern blotting and zymographic analyses indicated that gelatinase A was the
principal MMP expressed and excreted by these cells. Further, antisera capable of selectively
neutralizing gelatinase A activity also inhibited VSMC migration across a basement membrane
barrier (Pauly R. R., et al., Circulation Research, 1994;75:41-54).

In another embodiment, the protein of the invention is used to ensure normal kidney
function, which is dependent on the maintenance of tissues constructed from differentiated and
highly specialized renal cells. Those cells are in a dynamic balance with their surrounding
extracellular matrix (ECM) components (Davies M. et al., Kidney Int., 1992;41:671-678). Effective
glomerular filtration requires that a semi-permeable glomerular basement membrane (GBM)
composed of collagens, fibronectin, entactin, laminin and proteoglycans is maintained. A structural
equilibrium is achieved by balancing the continued deposition of ECM proteins with their
degradation by specific MMPs. These proteins are first secreted as proenzymes and are
subsequently activated in the extracellular space. These proteinases are in turn subject to
counterbalancing regulation of their activity by naturally occurring inhibitors such as TIMPs.

Deficiency or defects in any component of the filtration barrier may have catastrophic
consequences for longer term renal function. For example, in hereditary nephritis of Alport's type,
associated with mutations in genes encoding ECM proteins, defects in collagen assembly lead to
progressive renal failure associated with splitting of the GBM and eventual glomerular and
interstitial fibrosis. In contrast, in inflammatory renal diseases such as glomerulonephritis, cellular
proliferation of components of the glomerulus often precedes obvious ultrastructural alteration of the
ECM. Cytokines and growth factors implicated in proliferative glomerulonephritis, such as
interleukin-1, tumor necrosis factor, and transforming growth factor beta, can upregulate
metalloproteinase expression in renal mesangial cells (Martin J. et al., J. Immunol., 1986;137:525-529;
Marti H. P. et al., Biochem. J., 1993;291:441-446; Marti H. P. et al., Am. J. Pathol.,
1994;144:82-94). These metalloproteinases are believed to be intimately involved in the aberrant
tissue remodeling and cell proliferation characteristic of renal diseases such as IgA nephropathy,
which can progress through a process of gradual glomerular fibrosis and loss of functional GBM
to end-stage renal disease. Metalloproteinase expression has already been well-characterized in
experimental immune complex-mediated glomerulonephritis such as the anti-Thy 1.1 rat model
(Bagchus W. M., et al., Lab. Invest., 1986;55:680-687; Lovett D. H., et al., Am. J. Pathol.,
1992;141:85-98).

In another embodiment, the protein of the invention can be used as a treatment or diagnostic
tool for diseases of the gingiva. Collagenase and stromelysin activities have been demonstrated in
fibroblasts isolated from inflamed gingiva (Uitto V. J., et al., J. Periodontal Res., 1981;16:417-424),
and enzyme levels have been correlated to the severity of gum disease (Overall C. M., et al., J.
Periodontal Res., 1987;22:81-88).

In another embodiment, the protein of the invention is useful for treating or detecting ulcers.
Proteolytic degradation of extracellular matrix has been observed in corneal ulceration following
alkali burns (Brown S. I., et al., Arch. Ophthalmol., 1969;81:370-373). Thiol-containing peptides
inhibit the collagenase isolated from alkali-burned rabbit corneas (Burns F. R., et al., Invest.
Ophthalmol., 1989;30:1569-1575). Stromelysin, a member of the MMP family, is produced by
basal keratinocytes in a variety of chronic ulcers (Saarialho-Kere U. K., et al., J. Clin. Invest.,
1994;94:79-88). Stromelysin-1 mRNA and protein were detected in basal keratinocytes adjacent to
but distal from the wound edge in what probably represents the sites of proliferating epidermis.
Stromelysin-1 may thus prevent the epidermis from healing.

In another embodiment, the protein of the invention can be used as a treatment or diagnostic
tool for tumor angiogenesis. Inhibitors of MMPs have shown activity in models of tumor
angiogenesis (Taraboletti G., et al., Journal of the National Cancer Institute, 1995;87:293; and
Benelli R., et al., Oncology Research, 1994;6:251-257). Davies et al. (Cancer Res., 1993;53:2087-2091)
reported that a peptide decreased the tumor burden and prolonged the survival of mice
bearing human ovarian carcinoma xenografts. A peptide of the conserved MMP propeptide
sequence was a weak inhibitor of gelatinase A and inhibited human tumor cell invasion through a
layer of reconstituted basement membrane (Melchiori A., et al., Cancer Res., 1992;52:2353-2356),
and the natural tissue inhibitor of metalloproteinase-2 (TIMP-2) also showed blockage of tumor cell
invasion in in vitro models (DeClerck Y. A., et al., Cancer Res., 1992;52:701-708). Studies of
human cancers have shown that gelatinase A is activated on the invasive tumor cell surface
(Strongin A. Y., et al., J. Biol. Chem., 1993;268:14033-14039) and is retained there through
interaction with a receptor-like molecule (Monsky W. L., et al., Cancer Res., 1993;53:3159-3164).

In another embodiment, the protein of the invention can be used to treat and diagnose
rheumatoid arthritis. Collagenases have been implicated in a number of diseases, including
rheumatoid arthritis (Mullins, D. E. et al 1983), and it has been proposed to use MMP inhibitors in
the treatment of this condition. Several investigators have demonstrated consistent elevation of
stromelysin and collagenase in synovial fluids from rheumatoid and osteoarthritis patients as
compared to controls (Walakovits L. A., et al., Arthritis Rheum., 1992;35:35-42; Zafarullah M., et
al., J. Rheumatol., 1993;20:693-697). TIMP-1 and TIMP-2 prevented the formation of collagen
fragments, but not proteoglycan fragments, from the degradation of both the bovine nasal and pig
articular cartilage models for arthritis, while a synthetic peptide hydroxamate could prevent the
formation of both fragments (Andrews H. J., et al., Biochem. Biophys. Res. Commun.,
1994;201:94-101).

In another embodiment, the protein of the invention is used to treat or diagnose
inflammation. Gijbels et al. (J. Clin. Invest., 1994;94:2177-2182) recently described a peptide that
suppressed the development or reversed the clinical expression of experimental allergic
encephalomyelitis (EAE) in a dose-dependent manner, suggesting the use of MMP inhibitors in the
treatment of autoimmune inflammatory disorders such as multiple sclerosis. A recent study by
Madri has elucidated the role of gelatinase A in the extravasation of T-cells from the blood stream
during inflammation (Ramanic A. M. and Madri J. A., J. Cell Biology, 1994;125:1165-1178). This
transmigration past the endothelial cell layer is coordinated with the induction of gelatinase A and is
mediated by binding to the vascular cell adhesion molecule-1 (VCAM-1). Once the barrier is
compromised, edema and inflammation are produced in the CNS. Leukocytic migration across the
blood-brain barrier is known to be associated with the inflammatory response in EAE. Inhibition of
the metalloproteinase gelatinase A would block the degradation of extracellular matrix by activated
T-cells that is necessary for CNS penetration. These studies provided the basis for the belief that an
inhibitor of stromelysin-1 and/or gelatinase A will treat diseases involving disruption of
extracellular matrix resulting in inflammation due to lymphocytic infiltration, inappropriate
migration of metastatic or activated cells, or loss of structural integrity necessary for organ function.

The present invention provides the use of an MMP inhibitor in the manufacture of a
medicament for the treatment or prophylaxis of scars. Collagen is the major component of scar and
other contracted tissue and as such is the most important structural component to consider.
Contraction of tissues comprising extracellular matrix components, especially of
collagen-comprising tissues, may occur in connection with many different pathological conditions
and with surgical or cosmetic procedures. Contracture, for example, of scars, may cause physical
problems, which may lead to the need for medical treatment, or it may cause problems of a purely
cosmetic nature.

In experiments on in vitro models of scar contraction, collagen appears to be invaded
and permanently remodelled by fibroblasts, and such invasion and remodelling are inhibited by
collagenase inhibitors. The remodelling generally appears as contraction of the collagen, and this
contraction is inhibited by inhibition of collagenase. Furthermore, inhibition of other
MMPs also results in inhibition of contraction.

The present invention also provides the use of an MMP inhibitor in the treatment or
prophylaxis of a natural or artificial tissue comprising extracellular matrix components to inhibit,
i.e. restrict, hinder or prevent, contraction of the tissue, especially contraction resulting from a
pathological condition or from surgical or cosmetic treatment.

Cosmetic treatments, such as chemical or physical dermal abrasion, used as anti-ageing
treatments, cause trauma to the skin. Use of MMP inhibitors during the healing process which
occurs after the initial abrasion is a cosmetic use of MMP inhibitors according to the present
invention.

The present invention also provides the use of an MMP inhibitor to inhibit, i.e. restrict,
hinder or prevent, invasion by cells, especially fibroblasts, into tissue comprising an extracellular
matrix and/or migration by cells, especially fibroblasts, in or through tissue comprising an
extracellular matrix.

In another embodiment, the present protein is used to prevent or reduce contracture of scar
tissue resulting from eye surgery. Glaucoma surgery to create new drainage channels often fails
due to scarring and contraction of tissues. A method of preventing contraction of scar tissue formed
in the eye, such as the application of a suitable agent, is therefore invaluable. Such an agent may
also be used in the control of the contraction of scar tissue formed after corneal trauma or corneal
surgery, for example laser or surgical treatment for myopia or refractive error, in which contraction
of tissues may lead to inaccurate results. It is also useful in cases where scar tissue is formed on/in
the vitreous humor or the retina, for example, that which eventually causes blindness in some
diabetics and that which is formed after detachment surgery, called proliferative vitreoretinopathy.
Other uses include cases where scar tissue is formed in the orbit or on eye and eyelid muscles after
squint, orbital or eyelid surgery, or thyroid eye disease, and where scarring of the conjunctiva occurs,
as may happen after glaucoma surgery or in cicatricial disease, inflammatory disease, for example,
pemphigoid, or infective disease, for example, trachoma. A further eye problem associated with the
contraction of collagen-comprising tissues for which the methods and medicaments of the present
invention may be used is the opacification and contracture of the lens capsule after cataract
extraction.

In a preferred embodiment, the protein of the invention can be used for the treatment of
burns. Contraction of collagen-comprising tissue, which may also comprise other extracellular
matrix components, frequently occurs in the healing of burns. The burns may be chemical, thermal
or radiation burns and may be of the eye, the surface of the skin or the skin and the underlying
tissues. It may also be the case that there are burns on internal tissues, for example, caused by
radiation treatment.

A further aspect of the present invention is the inhibition of the contraction of skin grafts.
Skin grafts may be applied for a variety of reasons and may often undergo contraction after
application. As with the healing of burnt tissues, the contraction may lead to both physical and
cosmetic problems. It is a particularly serious problem where many skin grafts are needed as, for
example, in a serious burns case.

An associated area in which the medicaments and methods of the present invention are of
great use is in the production of artificial skin. To make a true artificial skin it is necessary to have
an epidermis made of epithelial cells (keratinocytes) and a dermis made of collagen populated with
fibroblasts. It is important to have both types of cells because they signal and stimulate each other
using growth factors. A major problem up until now has been that the collagen component of the
artificial skin often contracts to less than one tenth of its original area when populated by
fibroblasts. MMP inhibitors, for example, collagenase inhibitors, may be used to inhibit the
contraction to such an extent that the artificial skin can be maintained at a practical size.

Protein of SEQ ID NO:276 (157-15-4-0-B11-CS)

The protein of SEQ ID NO:276, encoded by the cDNA of SEQ ID NO:35, is a variant of a
testis-specific isoform of human calpastatin protein (Genseq accession number W19395). The
protein of SEQ ID NO:276 contains 2 potential transmembrane segments (positions 5 to 25 and
109 to 129) predicted by the software TopPred II (Claros and von Heijne, CABIOS Applic.
Notes, 10:685-686 (1994)), and a signal peptide (position 8: LAVILTLLGLAIL/AI). Like the
human calpastatin protein (Genseq accession number W19395), the protein of SEQ ID NO:276 is
over-represented in testis.
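
The transmembrane segments above were predicted with TopPred II. Purely as an illustration of the general idea of hydropathy scanning, the following sketch computes a sliding-window mean hydropathy using the Kyte-Doolittle scale. This scale is substituted here for familiarity and is not asserted to be the scale or algorithm applied by TopPred II. The window length, the threshold of 1.6, and the function names are assumptions; the only sequence taken from the text is the signal peptide stretch LAVILTLLGLAIL.

```python
# Kyte-Doolittle hydropathy values per amino acid (illustrative scale choice).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average Kyte-Doolittle hydropathy of a peptide segment."""
    return sum(KD[aa] for aa in segment) / len(segment)


def hydrophobic_windows(seq, window=13, threshold=1.6):
    """Return (start, end, score) for each window whose mean hydropathy
    reaches the threshold. Positions are 1-based and inclusive.
    The window size and threshold are assumed values, not TopPred II settings."""
    hits = []
    for i in range(len(seq) - window + 1):
        score = mean_hydropathy(seq[i:i + window])
        if score >= threshold:
            hits.append((i + 1, i + window, round(score, 2)))
    return hits


# Signal peptide stretch quoted in the text for SEQ ID NO:276.
signal_core = "LAVILTLLGLAIL"
print(hydrophobic_windows(signal_core))
```

A strongly hydrophobic stretch such as the quoted signal peptide passes the assumed threshold, whereas a run of charged residues does not.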

Calpastatin is a physiological inhibitor of calpains. Calpains, a group of ubiquitous Ca2+-activated
cytosolic proteases, are thought to participate in cytoskeletal remodeling events, cellular
adhesion, shape change, and mobility by the site-specific regulatory proteolysis of membrane- and
actin-associated cytoskeletal proteins (Beckerle et al., Cell 51:569-577, 1987; Yao et al., Am. J.
Physiol. 265(pt. 1):C36-46, 1993; and Shuster et al., J. Cell Biol. 128:837-848, 1995). Calpains
have also been implicated in the pathophysiology of cerebral and myocardial ischemia, platelet
activation, NF-kB activation, Alzheimer's disease, muscular dystrophy, cataract progression and
rheumatoid arthritis. There is considerable interest in inhibitors of calpain, as cellular adhesion,
cytoskeletal remodeling events and cell mobility are linked to numerous pathologies (Wang et al.,
Trends in Pharm. Sci. 15:412-419, 1994; Mehdi, Trends in Biochem. Sci. 16:150-153, 1991). In
addition, as the calpain/calpastatin system is involved in membrane fusion events for several cell
types, and calpain can be detected in human sperm and testes extracts by Western blotting with
specific antisera, the testis-specific calpastatin isoform (tCAST) may modulate calpain in the
calcium-mediated acrosome reaction that is required for fertilization (Li S et al., Biol Reprod,
63(1):172-8, 2000).

Calpastatin consists of a unique N-terminal domain (domain L) and four repetitive
protease-inhibitor domains (domains 1-4) (Lee WJ et al., J Biol Chem, 267(12):8437-42, 1992). The isolated
cDNAs from various mammalian species have conspicuous differences in the regions encoding the
N-terminal sequences and can be classified into four types. Alternative splicing is most likely the
cause of the molecular diversity, and the multiple isoforms are implicated in specific physiological
roles (Lee WJ et al., J Biol Chem, 267(12):8437-42, 1992). Type IV (or human tCAST), a shorter
isoform, is specifically expressed in testis (Takano J et al., J Biochem Tokyo, 128(1):83-92, 2000).
Human tCAST consists of a 40-amino-acid N-terminal T domain plus a part of domain II and all of
domains III and IV from the somatic isoform. The protein of SEQ ID NO:276 shows extensive
homology to the N-terminal region of the testis basic specific protein (U60665) and the human
calpastatin protein (W19395). The homologous region corresponds to domains T and II of the
human calpastatin protein (W19395). The T domain targets cytosolic localization and membrane
association of tCAST, whereas domain I of somatic calpastatin proteins (sCAST) exhibits a nuclear
localization function (Li S et al., Biol Reprod, 63(1):172-8, 2000).

It is believed that the protein of SEQ ID NO:276 is a member of the calpastatin family and,
as such, plays a role in cytoskeletal remodeling events, cellular adhesion, shape change, and 
mobility by the site-specific regulatory proteolysis of membrane- and actin-associated cytoskeletal 
proteins. Preferred polypeptides of the invention are polypeptides comprising the amino acids of
SEQ ID NO:276 from positions 1 to 119. Other preferred polypeptides of the invention are any
fragments of SEQ ID NO:276 having any of the biological activities described herein. 

One embodiment of the present invention relates to methods of using the protein of the
invention or part thereof in assays to detect the presence of calpain in a biological sample, such as in
bodily fluids, in tissue samples, or in mammalian cell cultures. As calpastatin can bind calpain
(Murachi, Biochemistry Int., 18(2):263-294, 1989), the protein of the invention can be used in assays
and diagnostic kits to test for the presence of calpain using techniques known to those skilled in the art.
Preferably, a defined quantity of the protein of the invention or part thereof is added to the sample
under conditions allowing the formation of a complex between the protein of the invention or part
thereof and calpain, and the presence of the complex and/or the free protein of the invention or part
thereof is assayed and compared to a control. Calpastatin has been shown to be useful as a marker of
intracellular calpain activation, and can be used for monitoring the involvement of calpain in
pathological situations (De Tullio et al., FEBS Lett., 475(1):17-21, 2000). Calpain has been
implicated in cytoskeletal protein degradation involved in the pathophysiology of ischemia and
disorders like Alzheimer's disease (Wronski et al., J. Neural Transm., 107(2):145-157, 2000),
apoptosis in neural cells of rats with spinal cord injury (SCI) (Ray, Brain Res., 867(1-2):80-9, 2000),
cell fusibility (Kosower et al., Methods Mol Biol., 144:181-94, 2000) and other physiopathologies.
Assays detecting any increased calpain level in a cell would thus allow the diagnosis of any of the
herein-described diseases or conditions. Moreover, a recent study showed that, in addition to their
proteolytic activities on cytoskeletal proteins and other cellular regulatory proteins,
calpain-calpastatin systems can also affect expression levels of genes encoding structural or regulatory
proteins (Chen et al., Am. J. Physiol. Cell Physiol., 279:C709-C716, 2000). Thus, the ability to
detect calpastatin and calpain levels will likely be useful for the diagnosis of an even larger number
of diseases and conditions.

In another embodiment, the polynucleotides or polypeptides of the invention may be used
for the detection of gametes, or of specific structures within the gametes, using any technique
known to those skilled in the art, including those involving the use of specific antibodies and
nucleic acid probes. Various studies have shown that calpastatin is present in the sperm acrosome
(Li et al., Biol. Reprod., 63:172-178, 2000), and more precisely between the plasma membrane
and outer acrosomal membrane of cynomolgus macaque spermatozoa (Yudin AI, J Androl,
21(5):721-9, 2000). The ability to visualize spermatozoa generally, or the sperm acrosome in
particular, has obvious utility for a number of applications, including for the analysis of infertility in
patients, as described below.

Another embodiment of the present invention relates to a method of inhibiting a calpain in a
cell. Various studies have shown that it is possible to inhibit calpains dose-dependently in cell-free
protease activity assays: the calpain inhibitor Cerebrolysin can protect microtubule-associated
protein 2 (MAP2) in a rat model of acute brain ischemia (Wronski et al., J. Neural Transm. Suppl.,
59:263-272, 2000), and E-64-D, a cell-permeable and selective inhibitor of calpain, can attenuate
calpain activity associated with apoptosis in rat SCI (Ray et al., Brain Res., 867(1-2):80-9, 2000).
Similarly, it is believed that the protein of SEQ ID NO:276 can be used to inhibit calpain in vitro or
in vivo. As calpain has been implicated in a number of pathological processes, diseases, and
conditions, such as the pathophysiology of cerebral and myocardial ischemia, platelet activation,
NF-kB activation, Alzheimer's disease, muscular dystrophy, cataract progression and rheumatoid
arthritis, any of these diseases or conditions can be treated or prevented by increasing or decreasing
the activity or expression of the present protein in cells of a mammal affected by the disease or
condition. Such an increase can be effected in any of a number of ways, including, but not limited
to, introducing a polynucleotide encoding the protein of the invention, operably linked to a
promoter, into a cell, and administering to a cell a compound that increases the activity or
expression of the protein of the invention. In addition, the expression or activation of the protein of
the invention can be inhibited in any of a large number of ways, including using antisense
oligonucleotides, antibodies, dominant negative forms of the protein, and heterologous
compounds that decrease the expression or activation of the protein. Such compounds can be
readily identified, e.g. by screening candidate compounds and detecting the level of expression or
activity of the protein using any standard assay.

In another preferred embodiment, the protein of the invention can be used to modulate and/or
characterize fertility, including for the treatment or diagnosis of infertility, and for contraception.
As the calpain/calpastatin system has been implicated in the acrosomal reaction, which is a required
step in fertilization, it is likely that the over- or under-expression or activation of the present protein
disrupts this reaction, thereby inhibiting fertility. Thus, the cause of infertility in many patients can
likely be detected by detecting the level of expression of the present protein, where an abnormal
level of activity or expression of the protein indicates that a cause of infertility involves the
calpain-dependent acrosomal reaction. Such a diagnosis would also point to methods of treating the
infertility, e.g. by increasing or decreasing the expression or activation of the protein in
spermatozoa. Alternatively, for contraception, the expression or activation of the protein can be
artificially disrupted, for example by increasing the protein level using polynucleotides encoding the
protein, using the protein itself, or using activators of protein expression or activity, or by
decreasing the protein level using inhibitors such as antisense oligonucleotides, antibodies,
dominant negative forms of the protein, and heterologous compounds that inhibit protein
expression or activity.

Protein of SEQ ID NO:295 (internal designation 181-20-3-0-B5-CS)

The protein of SEQ ID NO:295, encoded by the cDNA of SEQ ID NO:54, shows homology to
the rat, bovine, and human uromodulin precursor, Tamm-Horsfall urinary glycoprotein, and human
pancreatic secretory granule membrane major glycoprotein GP2 precursor. SEQ ID NO:295 exhibits
homology in the 5' region (over 40% identical and 60% similar) to both GP2 and uromodulin. Like
GP2 and uromodulin, the homologous segment contains EGF-like calcium-binding domains, several
potential disulfide bonds, and a number of potential N-linked glycosylation sites. Calcium-binding
EGF-like domains contain a calcium-binding site at the N-terminus, and have been found in proteins
which require calcium for their biological activity. Non-limiting examples of proteins which contain
calcium-binding EGF-like domains include: (1) Coagulation Factors X, VII, IX; (2) LDL receptors; (3)
thrombomodulin; and (4) fibrillin-1. Downing et al. [Cell 85:597-605 (1996)] described
disease-causing mutations that destabilized a covalently linked pair of Ca2+-binding EGF-like domains in
fibrillin-1 (associated with Marfan Syndrome). These domains form a rigid rod-like arrangement,
stabilized by interdomain calcium binding and hydrophobic interaction. Uromodulin (URO) is a
90-100 kDa glycoprotein synthesized by epithelial cells of the ascending loops of Henle and convoluted
tubule of the kidney. Except for glycosylation, URO is identical to Tamm-Horsfall protein (THP), the
most abundant protein in normal human urine. The relative abundance and specific nephronal location
of URO suggest that it may have important physiologic functions in the urinary system.

URO has also been found to be an immunosuppressive glycoprotein, inhibiting antigen-induced
human T-cell proliferation. More recent studies have shown that URO can trigger the
inflammatory response of neutrophils and stimulate human mononuclear cells to proliferate and 
release cytokines and gelatinase. 

Uromodulin has been shown to play a role in regulating the circulating activity of cytokines
since it binds to recombinant interleukin-1 and -2 and tumor necrosis factor (TNF) with high
affinity. Although URO does not inhibit the cytotoxic activity of TNFα as monitored by lysis of
tumor cell targets, it interacts with recombinant TNFα via carbohydrate chains. This interaction may
be critical in promoting clearance and/or reducing in vivo toxicity of TNFα and other lymphokines.
Endotoxic shock and sepsis are mediated by cytokines such as IL-1 and TNFα. Since URO appears to exhibit
inhibitory activity against IL-1 and TNFα, URO may be effective as a therapeutic agent against
these conditions. Uromodulin has also been implicated as a possible inhibitor of certain types of
bacterial infection in the bladder and urinary tract. URO has the ability to bind to type 1 pili of
Escherichia coli and prevent attachment to the surface of epithelium.

SEQ ID NO:295 also has homology to the glycoprotein GP2. GP2 is an integral protein
of the pancreatic zymogen granule membrane. GP2 is anchored to the lipid bilayer via a glycosyl
phosphatidylinositol (GPI) linkage and released by a calcium-activated enzyme into the content of
the zymogen granule. Through the process of exocytosis, GP2 is discharged into the pancreatic
duct. The protein is also soluble in the zymogens stored in the granule, secreted by the pancreas,
and detected in the pancreatic secretions. GP2 appears to play a role in progression of pancreatitis,
an inflammation of the pancreas accompanied by autodigestion of pancreatic tissue by its own
enzymes. After cloning and sequencing of GP2, a search of the Genbank database revealed one
homologous protein, namely uromodulin. Studies reveal that GP2 and URO not only share
structural homology, but are also functionally similar in that both can form ductal precipitates under
pathological conditions. The aggregation of these precipitates in the pancreas may lead to
obstruction of the pancreatic ducts and play a critical role in development of pancreatitis. Similarly,
aggregation of URO in the kidney may lead to blockage of the renal tubules and result in renal
disease.

The subject invention provides the protein of SEQ ID NO:295 and polynucleotide
sequences encoding SEQ ID NO:295. Also included in the invention are biologically active
fragments of the protein of SEQ ID NO:295 and polynucleotide sequences encoding these
biologically active fragments. "Biologically active fragments" are defined as those peptide or
polypeptide fragments of SEQ ID NO:295 which have at least one of the biological functions of the
full length protein (e.g., the ability to chelate calcium, bind to E. coli pili, or cause
immunomodulation of an individual). In one embodiment, the polypeptides of SEQ ID NO:295 are
interchanged with the polypeptides encoded by the human cDNA of clone 181-20-3-0-B5-CS.

The invention also provides variants of SEQ ID NO:295. These variants have at least about
80%, more preferably at least about 90%, and most preferably at least about 95% amino acid
sequence identity to the amino acid sequence of SEQ ID NO:295. Variants according to the subject
invention also have at least one functional or structural characteristic of SEQ ID NO:295, such as
the biological functions described above or EGF-like calcium-binding domains. The invention also
provides biologically active fragments of the variant proteins. Unless otherwise indicated, the
methods disclosed herein can be practiced utilizing SEQ ID NO:295, or variants thereof. Likewise,
the methods of the subject invention can be practiced using biologically active fragments of SEQ ID NO:295,
or variants of said biologically active fragments.
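
By way of illustration only, the percent identity thresholds recited above can be evaluated on a pair of pre-aligned sequences as sketched below. The convention used here (identical residues divided by the number of alignment columns, with gap columns counted as mismatches) is one of several in common use and is an assumption, as are the function names and the toy sequences, which are not taken from SEQ ID NO:295.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences ('-' marks a gap).
    Assumed convention: identical residues / alignment columns, where
    columns that are gaps in both sequences are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def identity_tier(pct):
    """Map a percent identity onto the tiers recited in the text."""
    if pct >= 95:
        return "at least about 95%"
    if pct >= 90:
        return "at least about 90%"
    if pct >= 80:
        return "at least about 80%"
    return "below 80%"


# Hypothetical 10-residue alignment differing at one position.
pct = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(pct, identity_tier(pct))
```

In practice the alignment itself would be produced by a standard alignment program, and the denominator convention should be stated alongside any reported identity value.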

Because of the redundancy of the genetic code, a variety of different DNA sequences can 
encode SEQ ID NO:295. It is well within the skill of a person trained in the art to create these 
alternative DNA sequences which encode proteins having the same, or essentially the same, amino
acid sequence. These variant DNA sequences are, thus, within the scope of the subject invention.
As used herein, reference to "essentially the same sequence" refers to sequences that have amino 
acid substitutions, deletions, additions, or insertions that do not materially affect biological activity. 
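
The redundancy of the genetic code referred to above can be made concrete with a short sketch. The code below builds the standard genetic code table, shows that two different DNA sequences translate to the same peptide, and counts how many distinct DNA sequences encode a given peptide. The function names and the short example peptides are illustrative assumptions and are not drawn from SEQ ID NO:295.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons encoding each amino acid (and the stop signal '*').
DEGENERACY = Counter(CODON_TABLE.values())


def translate(dna):
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(DEGENERACY[aa] for aa in peptide)


# Two different DNA sequences, one encoded peptide.
print(translate("CTGGCC"), translate("TTAGCT"))
print(count_encodings("LAV"))
```

Because the count is a product of per-residue codon numbers, it grows very quickly with peptide length, which is why a full-length protein can be encoded by an extremely large number of variant DNA sequences.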

"Recombinant nucleotide variants" are alternate polynucleotides which encode a particular 
protein. They can be synthesized, for example, by making use of the "redundancy" in the genetic
code. Various codon substitutions, such as the silent changes which produce specific restriction
sites or codon usage-specific mutations, can be introduced to optimize cloning into a plasmid or 
viral vector or expression in a particular prokaryotic or eukaryotic host system, respectively.

SEQ ID NO:295, and variants thereof, can be used to produce antibodies according to
methods well known in the art. The antibodies can be monoclonal or polyclonal. Antibodies can
also be synthesized against fragments of SEQ ID NO:295 as well as variants of SEQ ID NO:295
according to known methods. The subject invention also provides antibodies which specifically
bind to biologically active fragments of SEQ ID NO:295 or biologically active fragments of
variants of SEQ ID NO:295.

The subject invention also provides for immunoassays which are used to screen for,
monitor, or diagnose conditions or disorders associated with liver dysfunction and/or damage.
These conditions or disorders include, but are not limited to, hepatitis, cirrhosis, fibrosis,
pericholangitis, portal triaditis, chronic periportal inflammation, systemic lupus erythematosus,
Hodgkin's disease, granulomas, and cell dysplasia. For a number of
disorders listed above, expression of these genes at significantly higher or lower levels can be
routinely detected in certain liver tissues or cell types (e.g., cancerous) or bodily fluids (e.g., serum,
plasma, and blood) taken from an individual having such a disorder, relative to the standard gene
expression levels, e.g., the expression level in healthy tissue or bodily fluid from one or more
individuals not having the disorder. These types of assays allow for a non-invasive method of
screening for, diagnosing, or monitoring liver cancer in human subjects. Similarly, antibodies and
small molecules directed to the polypeptides can be used as immunological probes for differential
identification of the diseased tissue(s) or cells.

Additionally, nucleic acid and amino acid sequences of SEQ ID NOs:54 and 295 can be
used to provide polypeptides and biologically active fragments thereof for the repair of cellular
injury following liver damage and/or liver transplant. 

Furthermore, polypeptides, or biologically active fragments thereof, can be used for the
modulation of bacterial binding to epithelial cells or as a modulator of bacterial infection. In this
aspect of the subject invention, bacterial cells are contacted with an amount of a composition
comprising the polypeptide, or biologically active fragments thereof, sufficient to interfere with the
binding of bacteria to epithelial cells. In one embodiment, the bacteria are coliform bacterial cells.
In another embodiment, the bacterial cells are E. coli. Compositions comprising SEQ ID NO:295,
or biologically active fragments thereof, can be administered in any fashion required to provide a
therapeutic effect (e.g., orally, intravenously, intrathecally, intraarterially, etc.).

The subject invention also provides materials and methods for the treatment of endotoxic
shock and/or sepsis. In this embodiment, a subject can be treated with therapeutically effective
amounts of a composition comprising SEQ ID NO:295, or biologically active fragments thereof.

The subject invention also provides materials and methods for the in vivo or in vitro
chelation of calcium ions (Ca2+). In this aspect of the invention, SEQ ID NO: , or biologically
active fragments thereof, can be used to bind free Ca2+ by addition of the polypeptide, or
biologically active fragments thereof, to solutions, environmental samples, or biological samples.
Alternatively, a composition containing SEQ ID NO:295, or biologically active fragments
thereof, can be added to the solutions, environmental samples, or biological samples in amounts
sufficient to bind and remove free Ca2+ from solution.

In another aspect of the subject invention, SEQ ID NO:295, or biologically active fragments
thereof, can be used to modulate the immune system of a mammal. In this method,
immunomodulatory amounts of SEQ ID NO:295, or biologically active fragments thereof, can be
administered to a mammal in a pharmaceutically acceptable carrier. Methods of assessing the 
stimulated state of the immune system of the mammal can be practiced according to methods well 
known in the art. 

Protein of SEQ ID NOs:244, 251 (internal designation numbers 105-016-3-0-G10-CS and 105-074-3-0-H10-CS)

The 274 amino acid protein of SEQ ID NO:244, encoded by the cDNA of SEQ ID NO:3,
found in prostate and strongly expressed in the salivary gland, presents strong sequence similarities
with the yeast putative mitochondrial carrier protein PET8 (SWISSPROT accession number
P38921) and with similar proteins conserved among eukaryotes (D. melanogaster and C. elegans:
respective SPTREMBLNEW SPTREMBL SWISSPROT accession numbers Q9VBN7 and Q18934,
and S. pombe: SWISSPROT accession number Q10442). All members of the mitochondrial
carrier/transport protein superfamily exhibit sequence motifs highly similar to
P-X-D/E-X-X-K/R-X-R that are also found in 3 positions in the protein of the invention (positions 26 to 33, 108 to 115
and 199 to 206) (Belenkiy et al., Biochim. Biophys. Acta, 1467:207-218 (2000)). These
mitochondrial carrier protein signatures are associated with membrane-spanning segments
(Belenkiy et al., ibid; Kuan and Saier, Crit. Rev. Biochem. Mol. Biol., 28:209-233 (1993)). In fact, 4
candidate membrane-spanning segments are identified in the protein of the invention, from amino
acid positions 4 to 24, 51 to 71, 180 to 200 and 240 to 260. Other hydrophobic regions are found in
positions 86 to 107 and 139 to 162. In addition, the protein of SEQ ID NO:244 presents a putative
signal peptide in its very amino-terminal part (position 5 to 19).
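
As an illustrative sketch only, the signature P-X-D/E-X-X-K/R-X-R can be searched for in a protein sequence with a regular expression. The exact-match pattern below is a strict simplification, since the text describes motifs "highly similar to" the consensus; the function name and the toy sequence are hypothetical and are not derived from SEQ ID NO:244.

```python
import re

# P-X-D/E-X-X-K/R-X-R, written as a lookahead so overlapping hits are reported.
CARRIER_MOTIF = re.compile(r"(?=(P.[DE]..[KR].R))")


def find_carrier_motifs(seq):
    """Return (start, end, match) for each exact motif occurrence.
    Positions are 1-based and inclusive; each motif spans 8 residues."""
    return [(m.start() + 1, m.start() + 8, m.group(1))
            for m in CARRIER_MOTIF.finditer(seq)]


# Made-up sequence with one motif embedded after five leading residues.
toy = "MAAAA" + "PLDVLKTR" + "GGGG"
print(find_carrier_motifs(toy))
```

Each hit spans eight residues, which agrees with the eight-residue intervals recited above (for example positions 26 to 33). A tolerant search allowing near matches would require a scoring scheme that is not specified in the text.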

The protein of SEQ ID NO:251, encoded by the cDNA of SEQ ID NO:10, is a 72 amino
acid truncated form of the protein of SEQ ID NO:244. This shorter product results from the
absence, in the cDNA of SEQ ID NO:10, of the 110 bp exon (position 275 to 384) found in the
cDNA of SEQ ID NO:3. Nevertheless, the 72 amino acid encoded protein possesses the putative
signal peptide (position 5 to 19), the first mitochondrial carrier protein signature (position 26 to 33),
and two candidate membrane-spanning segments (positions 4 to 24 and 51 to 71).
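
The arithmetic behind the exon coordinates above can be checked with a minimal sketch. It confirms that positions 275 to 384 span 110 nucleotides and that this length is not a multiple of three, so omission of the exon would be expected to shift the downstream reading frame. The text states only that the shorter product results from the absence of the exon; the frameshift interpretation, like the function names below, is an inference offered for illustration.

```python
def exon_length(start, end):
    """Length of an exon given 1-based inclusive cDNA coordinates."""
    return end - start + 1


def preserves_reading_frame(deleted_nt):
    """A deletion keeps the downstream frame only if it removes whole codons."""
    return deleted_nt % 3 == 0


length = exon_length(275, 384)
print(length, length % 3, preserves_reading_frame(length))
```

A remainder of two nucleotides means that codons downstream of the exon junction are read in a different frame, which is consistent with, though not proof of, premature termination yielding a truncated product.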

Energy transduction in mitochondria requires the transport of many specific metabolites
across the inner membrane of this eukaryotic organelle. Different types of substrate carrier proteins
involved in energy transfer are found in the inner membrane. These proteins all seem to be
evolutionarily related, and constitute the mitochondrial carrier/transport protein (MCP/MTP)
superfamily. Structurally, MCP/MTP proteins are typically homodimeric integral transmembrane
polypeptides (subunit molecular weight ~30kD) that traverse the inner mitochondrial membrane six
times with both the N- and C-termini localized to the cytosolic side of the membrane. Each 30kD
subunit is composed of three tandem repeats of a domain of approximately one hundred residues
(~10kD). This 10kD domain contains two transmembrane regions and a sequence motif highly
similar to P-X1-D/E-X2-X3-K/R-X4-R, where X3 is a hydrophobic residue (Kuan and Saier, Crit. Rev.
Biochem. Mol. Biol., 28:209-233 (1993)). Five protein families of known function have been
identified among the mitochondrial carrier protein superfamily:

(1) The ADP/ATP carrier proteins (ACC), or ADP/ATP translocases, which under the
conditions of oxidative phosphorylation catalyze the one to one exchange of cytosolic ADP against
matrix ATP across the inner mitochondrial membrane (Fiore et al., Biochimie, 80:137-150 (1998)).
The ADP/ATP transport system can be blocked very specifically by two families of inhibitors:
atractyloside (ATR) and carboxyatractyloside (CATR) on one hand, and bongkrekic acid (BA) and
isobongkrekic acid (isoBA) on the other hand. It is well established that these inhibitors recognise
two different conformations of the carrier protein, the CATR- and BA-conformations, which exhibit
different chemical, immunochemical and enzymatic reactivities. Bakker and collaborators have
reported that myopathies might result from a defect in ADP/ATP transport (Bakker et al., Pediatr.
Res. 33:412-417 (1993)). Namely, the authors describe a 4-fold decrease in the concentration of the
ADP/ATP carrier protein in a patient with a mitochondrial myopathy.

(2) The 2-oxoglutarate/malate carrier protein (OGCP), which exports 2-oxoglutarate
into the cytosol and imports malate, or other dicarboxylic acids, into the mitochondrial matrix. This
protein plays an important role in several metabolic processes, such as the malate/aspartate and the 
oxoglutarate/isocitrate shuttles (Palmieri et al, J. Bioenerg. Biomembr. 25:493-501 (1993)). 

(3) The phosphate group carrier protein, which transports inorganic phosphate
groups from the cytosol into the mitochondrial matrix (Palmieri et al., ibid).

(4) The mammalian brown fat uncoupling proteins, such as UCP-1 (thermogenin), which
are transmembrane proton-translocating proteins present in the mitochondria of brown adipose
tissue, a specialized tissue which functions in heat generation and energy balance (Jezek and
Garlid, Int. J. Biochem. Cell. Biol. 30:1163-1168 (1998); Klingenberg, J. Bioenerg. Biomembr.
31:419-430 (1999); Nicolls and Locke, Physiol. Rev. 64:2-40 (1994); Rothwell and Stock, Nature
281:31-35 (1979)). Mitochondrial oxidation of substrates is accompanied by proton transport out of
the mitochondrial matrix, creating a transmembrane proton gradient. Typically, re-entry of protons
into the matrix via ATP synthase is coupled to ATP synthesis. However, UCP-1 functions as a
transmembrane proton transporter, permitting re-entry of protons into the mitochondrial matrix
unaccompanied by ATP synthesis. Environmental exposure to cold evokes neural and hormonal
stimulation of brown adipose tissue, which increases UCP-mediated proton transport, brown fat
metabolic activity, and heat production.

Studies with transgenic models indicate that brown fat and UCP-1 play an important role in
energy expenditure in rodents. Transgenic mice in which brown adipose tissue was ablated by a
toxin coupled to the UCP-promoter developed obesity and diabetes (Lowell et al., Nature
366:740-742 (1993)). Obesity in these transgenic animals developed in the absence of hyperphagia,
suggesting that the uncoupled mitochondrial respiration of brown fat is an important component of
energy expenditure. In a separate transgenic mouse model, ectopic expression of UCP-1 in white
adipose tissue of genetically-obese mice led to a significant reduction in body weight and fat stores
(Kopecky et al., J. Clin. Invest. 96:2914-2923 (1995)). These studies indicate that activity of UCP-1
is accompanied by energy expenditure and weight loss in rodents. Two other UCP proteins have
recently been cloned. The first, uncoupling protein-like protein (UCPL) or UCP-2 (59%
homologous), is widely expressed (heart, kidney, lung, placenta and white fat) and enriched in
tissues of the lymphoid lineage (Fleury et al., Nature Genetics 15:269-272 (1997)). The second,
UCP-3 (57% homologous), is predominantly localized to skeletal muscle and brown fat (Boss et
al., FEBS Lett. 408:39-42 (1997)). UCP-3 has been found to be regulated by cold and thyroid
hormone (Larkins et al., Biochem. Biophys. Res. Comm. 240:222-227 (1997)).

Thermogenic protein activity, such as that found with UCP-1, may be useful in reducing, or
preventing the development of, excess adipose tissue, such as that found in obesity. Obesity is
becoming increasingly prevalent in developed societies. Attempts to reduce food intake, or to
decrease hypernutrition, are usually fruitless in the medium term because the weight loss induced by
dieting results in both increased appetite and decreased energy expenditure (Leibel et al., New Engl.
J. Med. 322:621-628 (1995)). The intensity of physical exercise required to expend enough energy
to materially lose adipose mass is too great for many obese people to undertake on a sufficiently
frequent basis. Thus, obesity is currently a poorly treatable, chronic, essentially intractable
metabolic disorder. In addition, obesity carries a serious risk of co-morbidities, including Type 2
diabetes, increased cardiac risk, hypertension, atherosclerosis, degenerative arthritis, and increased
incidence of complications of surgery involving general anesthesia.

(5) The tricarboxylate transport protein (or citrate transport protein), which is
involved in citrate-H+/malate exchange. This protein is important for the bioenergetics of hepatic
cells as it provides a carbon source for fatty acid and sterol biosyntheses, and NAD for the
glycolytic pathway (Kaplan et al., J. Biol. Chem. 268:13682-13690 (1993)).

It is believed that the protein of SEQ ID NO:244 or part thereof is a member of the
mitochondrial carrier/transport protein superfamily and, as such, plays a role in mitochondrial
processes such as ADP/ATP, malate/aspartate, 2-oxoglutarate/isocitrate, and citrate-H+/malate
exchanges across the inner membrane and phosphate group transport, and in physiological roles such as
regulation of body weight and energy balance, muscle nonshivering thermogenesis, fever, and
defense against the generation of reactive oxygen species. Preferred polypeptides of the invention
are polypeptides comprising the amino acids of SEQ ID NO:244 from amino acid positions 26 to
33, 108 to 115 and 199 to 206 on one hand, and from positions 4 to 24, 51 to 71, 86 to 107, 139 to
162, 180 to 200 and 240 to 260 on the other hand. Other preferred polypeptides of the invention are
fragments of SEQ ID NO:244 having any of the biological activities described herein. It is believed
that the protein of SEQ ID NO:251 is a 72 amino acid truncated form of the 274 amino acid protein
of SEQ ID NO:244, and corresponds to one subunit of the tripartite structure of mitochondrial
carrier/transport proteins. Preferred polypeptides are polypeptides comprising the amino acids of
SEQ ID NO:251 from positions 4 to 24, 26 to 33 and 51 to 71.

The activity of the protein of the invention can be assessed using cultured cells. For
example, nucleic acids encoding the protein of SEQ ID NO:244 can be cloned into a eukaryotic
vector and transfected into a population of cells. Transfected mammalian cells are then tested for
their carrier activity, e.g., the import of ADP, dicarboxylic acids, inorganic phosphate groups, or H+
into the mitochondrial matrix, and the export of ATP, 2-oxoglutarate, or tricarboxylate-H+ into
the cytosol. These transfected cell lines may allow the development of in vitro assays for the
identification of modulators of the carrier activity, such as atractyloside (ATR),
carboxyatractyloside (CATR), bongkrekic acid (BA) and isobongkrekic acid (isoBA), which were
described above in connection with the ADP/ATP mitochondrial carrier. Such modulators are
useful for the treatment of any diseases or conditions associated with the protein of the invention.

Another embodiment of the invention relates to compositions and methods using the
protein of the invention or part thereof to label mitochondria, or more specifically the inner
mitochondrial membrane, in order to visualize any change in the number, topology or
morphology of this organelle, for example in association with a mitochondria-related
human disorder, such as neuroleptic malignant syndrome (NMS) (Kubo et al., Forensic Sci.
Int. 115:155-158 (2001)), Rett syndrome (Armstrong, Brain Dev. 14 Suppl:S89-98
(1992)), Alpers disease (Chow and Thorburn, Hum. Reprod. 15 Suppl 2:68-78 (2000)) or
mitochondrial encephalomyopathies (Handran et al., Neurobiol. Dis. 3:287-298 (1997)).
For example, the protein may be rendered easily detectable by inserting the cDNA
encoding the protein of the invention into a eukaryotic expression vector in frame with a
sequence encoding a tag. Eukaryotic cells expressing the tagged protein of the
invention may also be used for the in vitro screening of drugs or genes capable of treating
any mitochondria-related disease or condition. The protein of the invention can also be
used to specifically label cells of the salivary gland or of the prostate, e.g. for histological
analyses or for the identification of the origin of tumor cells.

The protein of the invention can also be used as a carrier/transporter to translocate
radiolabeled or chemically labeled metabolites (ADP, dicarboxylic acids, inorganic
phosphate groups) from the cytosol to the matrix of the mitochondria in order to
specifically label this organelle, e.g. to follow its modifications. For example, radiolabeled
or chemically labeled precursors can be added to an in vitro culture of mammalian cells
stably transfected with and expressing the protein of the invention. The labeling of the
organelles can then be stopped at different times after the beginning of the experiment by
adding specific inhibitors of carrier/transporter proteins, such as atractyloside (ATR),
carboxyatractyloside (CATR), bongkrekic acid (BA), or isobongkrekic acid (isoBA). Cells
with labeled mitochondria can be used for the in vitro screening of drugs or genes capable
of causing mitochondrial modifications.

Still another embodiment of the invention relates to methods of
delivering heterologous compounds, either polypeptides or polynucleotides, to the inner
membrane of mitochondria by recombinantly or chemically fusing a fragment of the protein
of the invention to a heterologous polypeptide or polynucleotide. Preferred fragments are
the putative signal peptide, the four membrane-spanning segments and/or any other
fragments of the protein of the invention that may contain targeting signals for
mitochondria, including but not limited to matrix targeting signals as defined in Herrmann
and Neupert, Curr. Opinion Microbiol. 3:210-4 (2000); Bhagwat et al., J. Biol. Chem.
274:24014-22 (1999); Murphy, Trends Biotechnol. 15:326-30 (1997); Glaser et al., Plant
Mol. Biol. 38:311-38 (1998); Ciminale et al., Oncogene 18:4505-14 (1999). Such
heterologous compounds may be used to modulate mitochondrial activities, such as to
induce and/or prevent mitochondrial-induced apoptosis or necrosis. For example, these
heterologous compounds may be used in the treatment and/or the prevention of disorders in
which apoptosis is deleterious, including, but not limited to, immune deficiency syndromes
(including AIDS), type I diabetes, pathogenic infections, cardiovascular and neurological
injury, alopecia, aging, degenerative diseases such as Alzheimer's Disease, Parkinson's
Disease, Huntington's disease, dystonia, Leber's hereditary optic neuropathy, schizophrenia,
and myodegenerative disorders such as "mitochondrial encephalopathy, lactic acidosis, and
stroke" (MELAS), and "myoclonic epilepsy ragged red fiber syndrome" (MERRF). In
addition, heterologous polynucleotides may be used to deliver nucleic acids for
mitochondrial gene therapy, i.e. to replace a defective mitochondrial gene and/or to inhibit
the deleterious expression of a mitochondrial gene.

The invention further relates to methods and compositions used to modify the protein of the
invention. Post-translational modifications encompassed by the invention include N-linked or
O-linked carbohydrate chains, processing of N-terminal or C-terminal ends, attachment of chemical
moieties to the amino acid backbone, chemical modifications of N-linked or O-linked carbohydrate
chains, and addition or deletion of an N-terminal methionine residue as a result of prokaryotic host
cell expression. These post-translational modifications of the protein of the invention may be very
useful in the search for its putative protein partners, using approaches such as screening of an
expression cDNA library with a radiolabeled recombinant protein, as post-translational
modifications are of prime importance in protein-protein interactions. Identification of protein
partners of mitochondrial carrier proteins would allow the study of their regulation ex vivo and in
vivo in normal versus pathological cases (for an example concerning the UCP1 mitochondrial carrier
protein and its 14.3.3 physical partner, see Pierrat et al., Eur. J. Biochem. 267:2680-2687 (2000)).

Another embodiment of the invention relates to compositions and methods using
polynucleotide sequences encoding the protein of the invention or part thereof to establish
transgenic model animals (D. melanogaster, M. musculus), by any method familiar to those skilled
in the art. By modulating in vivo the expression of the transgene with drugs or modifier genes
(activator or suppressor genes), animal models can be developed that mimic human
mitochondria-associated disorders such as myopathies or obesity. These animal models would thus
allow the identification of potential therapeutic agents for treatment of the disorders. In addition,
recombinant cell lines derived from these transgenic animals may be used for similar approaches
ex vivo.

In one embodiment, the protein of SEQ ID NO:251, corresponding to the 72 amino acid
truncated form of SEQ ID NO:244, may be used as a dominant negative variant to inhibit the
function of the full-length form of the protein of SEQ ID NO:244 in vitro or in vivo. Inactivation of
mitochondrial carriers in this way may allow the development of animal models for human
disorders. Recently, for example, Lowell and collaborators have shown in the mouse that a targeted
destruction of UCP1 by the diphtheria toxin A chain is able to produce obese animals (Kozak and
Koza, ibid.; Lowell et al., ibid.).

Protein of SEQ ID NO: 285 (internal designation 174-39-2-0-A3-CS)

The protein of SEQ ID NO: 285, encoded by the cDNA of SEQ ID NO: 44 (clone
174-39-2-0-A3-CS), is overexpressed in cancerous prostate, fetal brain, muscle and placenta. The
protein is homologous to the NADH-cytochrome b5 reductase isoform and to the human electron
transport protein.

NADH-cytochrome b5 reductase proteins belong to a flavoenzyme family sharing common
structural features and whose members (ferredoxin-NADP+ reductase, NADPH-cytochrome P450
reductase, NADPH-sulfite reductase, NADH-cytochrome b5 reductase and NADH-nitrate
reductase) are involved in photosynthesis, in the assimilation of nitrogen and sulfur, in fatty-acid
oxidation, in the reduction of methemoglobin and in the metabolism of many pesticides, drugs and
carcinogens (Karplus et al., Science 251:60-6 (1991)). In addition, cytochrome b5 reductase is
thought to play a role in the prevention of apoptosis following oxidative stress (see review by
Villalba et al., Mol. Aspects Med. 18 Suppl:S7-13 (1997)).

It is believed that the protein of SEQ ID NO: 285 may be an oxidoreductase. Thus it may play
a role in electron transport and general aerobic metabolism and may be associated with mitochondrial
membranes. In addition, the protein of the invention may be able to use FAD and/or molybdopterin as
cofactors. It may be involved in photosynthesis, in the assimilation of nitrogen and sulfur, in fatty-acid
oxidation, in the reduction of methemoglobin and in the metabolism of many pesticides, drugs and
carcinogens. Preferred polypeptides of the invention are fragments of SEQ ID NO: 285 having any of
the biological activities described herein. The oxidoreductase activity of the protein of the invention
may be assayed using any technique known to those skilled in the art. The ability to bind a cofactor
may also be assayed using any technique well known to those skilled in the art including, for example,
the assay for binding NAD described in US patent 5,986,172.

In another embodiment, the protein of the invention or part thereof is used to prevent cells
from undergoing apoptosis. In a preferred embodiment, the apoptosis-active polypeptide is added to
an in vitro culture of mammalian cells in an amount effective to reduce apoptosis. Furthermore, the
protein of the invention or part thereof may be useful in the diagnosis, the treatment and/or the
prevention of disorders in which apoptosis is deleterious, including but not limited to immune
deficiency syndromes (including AIDS), type I diabetes, pathogenic infections, cardiovascular and
neurological injury, alopecia, aging, degenerative diseases such as Alzheimer's Disease, Parkinson's
Disease, Huntington's disease, dystonia, Leber's hereditary optic neuropathy, schizophrenia, and
myodegenerative disorders such as "mitochondrial encephalopathy, lactic acidosis, and stroke"
(MELAS), and "myoclonic epilepsy ragged red fiber syndrome" (MERRF).

The invention further relates to methods and compositions using the protein of the invention
or part thereof to diagnose, prevent and/or treat several disorders in which energy metabolism is
impaired, or needs to be impaired, including but not limited to mitochondriocytopathies, necrosis,
aging, neurodegenerative diseases, myopathies, methemoglobinemia, hyperlipidemia, obesity,
cardiovascular disorders and cancer. For diagnostic purposes, the expression of the protein of the
invention could be investigated using any of the Northern blotting, RT-PCR or immunoblotting
methods described herein and compared to the expression in control individuals. For prevention
and/or treatment purposes, the protein of the invention may be used to enhance electron transport
and increase energy delivery using any of the gene therapy methods described herein.
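
The comparison with control individuals mentioned above can be made in many ways. The following
minimal sketch shows one possible approach, a z-score of a sample against a set of control
measurements. The function name `expression_z_score`, the choice of a z-score, and all numeric
values are assumptions made for illustration; they are not part of the disclosed methods and no
decision threshold is implied.

```python
from statistics import mean, stdev
from typing import Sequence


def expression_z_score(sample_level: float, control_levels: Sequence[float]) -> float:
    """Number of control standard deviations separating the sample from the control mean."""
    if len(control_levels) < 2:
        raise ValueError("need at least two control measurements")
    spread = stdev(control_levels)  # sample standard deviation of the controls
    if spread == 0:
        raise ValueError("control measurements show no variation")
    return (sample_level - mean(control_levels)) / spread


# Hypothetical normalized band intensities (arbitrary units), for illustration only.
controls = [1.0, 1.2, 0.8, 1.0]
z = expression_z_score(2.0, controls)
print(round(z, 2))
```

A positive score indicates higher expression than in the controls and a negative score lower
expression; how large a deviation is meaningful would have to be established experimentally.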

Protein of SEQ ID NO: 368 (internal designation 187-45-0-0-1 18-CS)

The protein of SEQ ID NO: 368, encoded by the cDNA of SEQ ID NO: 127, is a 78 amino
acid long polypeptide. The sequence of the protein of SEQ ID NO: 368 is identical to the sequence
of the human Dad1 protein, the defender against apoptotic cell death 1 protein, a subunit of the
mammalian oligosaccharyltransferase (OST), except that the last 43 residues of Dad1 are replaced
by a series of 8 different amino acids in the protein of the invention. In addition, the protein of SEQ
ID NO: 368 displays the Pfam signature for DAD family proteins from positions 1 to 78 as well as
two putative transmembrane domains from positions 31 to 51 and 54 to 74. The Dad1 protein is a
113 amino acid long protein whose mRNA is composed of 3 exons [see Genbank accession
number D15057 and Nakashima, T. et al. (1993) Molecular and Cellular Biology 13:6367-6374].
The cDNA of SEQ ID NO: 127 is composed of the first and third exons of the Dad1 cDNA, whereas
the second exon of the Dad1 cDNA is missing. Taken together, these data indicate that the protein
encoded by the cDNA of SEQ ID NO: 127 is a new isoform of the Dad1 protein resulting from an
alternative splicing event.
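
The lengths stated above are mutually consistent, as the following minimal sketch checks. It also
models the exon skipping in a purely symbolic way. The exon labels and the helper `splice` are
illustrative assumptions; real exon sequences would have to be taken from Genbank accession
D15057, and no exon boundaries are asserted here.

```python
# Figures stated in the text.
DAD1_LENGTH = 113          # residues in the Dad1 protein
REPLACED_C_TERMINAL = 43   # last residues of Dad1 that are absent from the isoform
NOVEL_RESIDUES = 8         # different residues found in their place

shared_n_terminal = DAD1_LENGTH - REPLACED_C_TERMINAL
isoform_length = shared_n_terminal + NOVEL_RESIDUES
# The novel residues occupy the last positions of the isoform (1-based, inclusive).
novel_positions = (shared_n_terminal + 1, isoform_length)


def splice(exons, keep):
    """Join the exons whose 1-based index is listed in keep, in the given order."""
    return "".join(exons[i - 1] for i in keep)


# Hypothetical exon labels standing in for real nucleotide sequences.
exons = ["<exon1>", "<exon2>", "<exon3>"]
full_transcript = splice(exons, [1, 2, 3])
skipped_transcript = splice(exons, [1, 3])  # second exon missing

print(isoform_length, novel_positions)
print(full_transcript, skipped_transcript)
```

Under this reading, the 70 residues shared with Dad1 are followed by 8 isoform-specific residues,
giving the stated total of 78.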

Asparagine-linked glycosylation is a highly conserved protein modification reaction that
occurs in all eukaryotes. The initial stage in the biosynthesis of N-linked glycoproteins, catalysed by
the enzyme oligosaccharyltransferase (OST), involves the transfer of a preassembled high-mannose
oligosaccharide from a dolichol-linked oligosaccharide donor onto asparagine acceptor sites in
nascent proteins in the lumen of the rough endoplasmic reticulum [for review, see Silberstein, S. et
al. (1996) FASEB J. 10:849-858].

Protein glycosylation is essential for the structure and function of many proteins and is
involved in the control of many diverse biological processes (Paulson, Trends in Biochem. Sci., 1989,
14, 272; Sadler, In Biology of Carbohydrates, 2nd Ed., Ginsburg & Robbins, Ed., John Wiley &
Sons: New York, 1984, Vol. 2, pg. 87). For example, protein glycosylation has been found to be
crucial for the development, growth and proper function of complex organisms, while the aberrant
glycosylation of proteins has been associated with diseased and transformed cells.

The mammalian oligosaccharyltransferase is composed of the four ER membrane proteins,
ribophorin I and II (RI and RII), OST48, and DAD1, which form an oligomeric complex. RI and
OST48, and probably also RII, are type I transmembrane proteins. The luminal domain of OST48
interacts with those of RI and RII and the cytoplasmic domain of OST48 has affinity for the
cytoplasmically exposed N-terminal tail of DAD1 [Kelleher, D. et al. (1997) Proc. Natl. Acad. Sci.
USA 94:4994-4999; Fu, J. et al. (1997) J. Biol. Chem. 272:29687-29692].

Dad1 is a small hydrophobic protein, thought to be an integral membrane protein, with a
cytoplasmically located N terminus and up to three transmembrane domains. As is true for the other
subunits of OST, the precise role of Dad1 in N-glycosylation is not known. However, it has been
shown that Dad1 is critical for the function and the structural integrity of the OST complex [Sanjay,
A. et al. (1998) J. Biol. Chem. 273:26094-26099]. Also, it is worth noting that the Dad1 protein was
first identified in 1993 as a mammalian cell death suppressor, since loss of its function induces
apoptosis in hamster BHK21 cells [Nakashima, T. et al. (1993) Molecular and Cellular Biology
13:6367-6374]. Since then, several reports have confirmed the anti-apoptotic role of Dad1 [Hong,
NA. et al. (2000) Dev. Biol. 220:76-84; Brewster, JL. et al. (2000) Genesis 26:271-8; Yoshimi, M.
et al. (2000) Biochem. Biophys. Res. Commun. 276:965-9].

Dad1 is a highly conserved protein whose sequence has been determined for diverse
organisms including several vertebrates, a nematode, and several plants. A comparison of these
sequences reveals that the amino-terminal region preceding the first membrane-spanning segment is
the least conserved region of the protein both with respect to length and amino acid sequence identity.
The most highly conserved sequences of Dad1 include the second and third membrane-spanning
segments, making them probably the most crucial regions for Dad1 function [Kelleher, D. et al.
(1997) Proc. Natl. Acad. Sci. USA 94:4994-4999]. The importance of the C-terminal region for
mediating Dad1 functions has been recently confirmed [Makishima, T. et al. (2000) J. Biochem.
(Tokyo) 128:399-405].

Therefore, Dad1 is thought to act as a positive regulator of the oligosaccharyltransferase
complex, and as a negative regulator of apoptosis. In addition, the C-terminus of the Dad1 protein
seems to be important for mediating these functions. As mentioned above, the protein of the
invention is a new isoform of the Dad1 protein resulting from an alternative splicing event. As a
result of this alternative splicing event, the C-terminus of the protein of the invention is shortened
and does not display the third transmembrane domain of Dad1. Since the C-terminus of Dad1 has
been shown to be important for mediating the protein function, it is believed that the protein of the
invention acts antagonistically to Dad1. It is worth noting that this type of
situation, in which the same gene gives rise by alternative splicing to different protein products with
opposing functions, is a common theme among apoptosis genes [for a review, see Reed, JC. (1999)
Nat. Biotechnol. 17:1064-65].

Thus, it is believed that the protein of the invention of SEQ ID NO: 368 plays a role in the
control of N-glycosylation of cellular proteins. Preferably, the protein of the invention is thought to
act as a positive regulator of apoptosis and a negative regulator of the OST complex. Preferred
polypeptides of the invention are polypeptides comprising the amino acids of SEQ ID NO: 368
from positions 1 to 78, and 71 to 78. Other preferred polypeptides of the invention are fragments of
SEQ ID NO: 368 having any of the biological activities described herein. The activity of the protein
of the invention or part thereof on protein N-glycosylation may be assayed using any of the assays
known to those skilled in the art. For example, one could use DNA-mediated gene transfer
techniques in order to introduce the cDNA sequence of SEQ ID NO: 127 or part thereof into cell
lines so that the protein of SEQ ID NO: 368 or part thereof is overexpressed in these cell lines. The
resulting effect of this overexpression on the N-glycosylation of proteins can then be studied using
immunoblotting or Western blotting of glycoproteins [Makishima et al. (1997) Genes Cells
2:129-141; Silberstein et al. (1995) J. Cell Biol. 131:371-383; Hong et al. (2000) Developmental
Biology 220: ]. The activity of the protein of the invention or part thereof on cellular apoptosis may be
assayed using any of the assays known to those skilled in the art including those described by
Nakashima et al. (1993) supra.

One object of the present invention is to provide compositions and methods of targeting heterologous
polypeptides to the endoplasmic reticulum by recombinantly or chemically fusing a fragment of the
proteins of the invention to a heterologous polypeptide. Preferred fragments are any fragments of the
proteins of the invention, or part thereof, that may contain targeting signals for the endoplasmic
reticulum such as those described in Pidoux and Armstrong, EMBO J 1992 Apr;11(4):1583-91; Munro
S, Pelham HR, Cell 1987 Mar 13;48(5):899-907; Pelham HR, Trends Biochem Sci 1990
Dec;15(12):483-6.

In another embodiment, the invention relates to compositions and methods using the protein
of the invention or part thereof to stimulate the entry of cells into apoptosis. In a preferred
embodiment, the pro-apoptosis protein of the invention or part thereof is added to an in vitro culture
of mammalian or plant cells in an amount effective to stimulate apoptosis. In another preferred
embodiment, the cDNA sequence of SEQ ID NO: 127 or part thereof may be used to create
transgenic animals or plant cells in which the disclosed protein of the invention or part thereof can
be expressed at higher levels than normal whenever and wherever it is desired. Ways to create
transgenic cells in which the expression of the transgene can be turned on or off whenever desired
are well known in the art. Increasing the expression level of the protein of the invention in cells to
stimulate programmed cell death may be useful for applications in which a given species of cells
becomes undesirable upon a given event, e.g., infection, transformation, or the end of a production
process.

Furthermore, the invention relates to methods and compositions using the protein of the
invention or part thereof to diagnose, prevent and/or treat disorders characterized by abnormal cell
proliferation and/or programmed cell death, including but not limited to cancer, immune deficiency
syndromes (including AIDS), type I diabetes, pathogenic infections, cardiovascular and
neurological injury, alopecia, aging, degenerative diseases such as Alzheimer's Disease, Parkinson's
Disease, Huntington's disease, dystonia, Leber's hereditary optic neuropathy, schizophrenia, and
myodegenerative disorders such as "mitochondrial encephalopathy, lactic acidosis, and stroke"
(MELAS), and "myoclonic epilepsy ragged red fiber syndrome" (MERRF). For diagnostic
purposes, the expression of the protein of the invention could be investigated using any of the
Northern blotting, RT-PCR or immunoblotting methods described herein and compared to the
expression in control individuals. For the prevention and/or treatment of disorders in which
cell proliferation needs to be reduced and/or apoptosis increased, the expression of the protein of the
invention may be enhanced using any of the gene therapy methods described herein or known to
those skilled in the art. For the prevention and/or treatment of disorders in which cell
proliferation needs to be enhanced and/or apoptosis reduced, inhibition of endogenous expression of
the protein of the invention may be achieved using any methods known to those skilled in the art
including the triple helix and antisense strategies described herein.

Moreover, antibodies to the protein of the invention or part thereof may be used for
detection of the endoplasmic reticulum for histological purposes using any techniques known to
those skilled in the art.

Protein of SEQ ID NO: 284 (internal designation 174-38-3-0-C9-CS)



The protein of SEQ ID NO: 284, encoded by the cDNA of SEQ ID NO: 43, is overexpressed
in salivary gland. The 406 amino acid long protein of the invention, which is similar in size to
fucosyltransferases, displays a Pfam motif of the fucosyltransferase family from residues 70 to 406.
Furthermore, the protein of the invention is homologous to a putative fucosyltransferase of
Drosophila melanogaster (STR accession number: Q9VLC1 and Q9VLC1). The protein of SEQ ID
NO: 284 also shares homology with the alpha-1,3 fucosyltransferase (E.C. 2.4.1.152) found in
Brachydanio rerio (EMBL accession number: AB023627), Schistosoma mansoni (GENPEPT
accession number: API 83577-1), cattle (SPTREMBL accession number Q9TQQ3), and human
species (GENPEPT accession number: AJ132772_2). Like fucosyltransferases, the protein of the
invention displays the features of type II transmembrane proteins, with a short N-terminal
cytoplasmic tail, a 9-29 amino acid signal-anchor transmembrane domain, and a large C-terminal
domain. Furthermore, the protein of the invention displays an almost perfect consensus motif of
the alpha-1,3 fucosyltransferases from residues 315 to 345 (Breton et al. Glycobiology 1998;1:87-94).

Fucosyltransferases are a family of enzymes that catalyze the transfer of fucose from
GDP-fucose to galactose in an alpha-1,2 linkage, and to N-acetylglucosamine in alpha-1,3-,
alpha-1,4- and alpha-1,6-linkages. Since all fucosyltransferases use the same nucleotide sugar, their
specificity probably resides in the recognition of the acceptor and in the type of linkage formed. In
the human species, fucosyltransferases, which are type II membrane proteins found in the Golgi, can
be split into three distinct families (Breton et al. Glycobiology 1998;1:87-94): (1) the
alpha-1,2-fucosyltransferases, hFUT1 and hFUT2, which yield nearly identical products as only a
single carbohydrate linkage differentiates type I from type II glycans. hFUT1 determines the
expression of the O-type antigen (H antigen) of the ABO blood group system on erythrocytes, whereas
hFUT2 (Se) determines it in saliva, i.e. secretor status; (2) the alpha-1,3-fucosyltransferases, which
constitute a distinct homogeneous family of proteins, although some regions display similarities with
the alpha-1,2- and alpha-1,6-fucosyltransferases (Breton et al. Glycobiology 1998;1:87-94). Five
alpha-1,3-fucosyltransferases have been characterized to date in the human species, i.e. hFUT3 (Lewis
enzyme), hFUT4 (myeloid-type), hFUT5, hFUT6 (plasma-type), and hFUT7. These are involved in
the last steps of the biosynthesis of the carbohydrate antigen sialyl Lewis of ABH (de Vries et al. J
Biol Chem 1995;270:8712-22; Kimura et al. Biochem Biophys Res Commun 1997;237:131-7);
(3) the alpha-1,6-fucosyltransferase, hFUT8, which is implicated in the synthesis of N-glycans
(Miyoshi et al. Biochim Biophys Acta 1999;1473:9-20).
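
The three families listed above can be summarized as a small lookup table, sketched below. The
dictionary layout and the function name `family_of` are assumptions made for illustration; only
the enzyme names and their family assignments are taken from the text.

```python
# Human fucosyltransferase families as listed in the text.
FUT_FAMILIES = {
    "alpha-1,2": ["hFUT1", "hFUT2"],
    "alpha-1,3": ["hFUT3", "hFUT4", "hFUT5", "hFUT6", "hFUT7"],
    "alpha-1,6": ["hFUT8"],
}


def family_of(enzyme: str) -> str:
    """Return the linkage family of a named human fucosyltransferase."""
    for family, members in FUT_FAMILIES.items():
        if enzyme in members:
            return family
    raise KeyError(f"{enzyme} is not listed in the table")


print(family_of("hFUT3"))
print(sum(len(members) for members in FUT_FAMILIES.values()))
```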

The fucosylated cell surface glycoconjugates play important roles in physiological and
pathological processes, such as fertilization, embryogenesis, lymphocyte trafficking, immune
response, and cancer metastasis (Staudacher et al. Trends Glycosci Glycotechnology
1996;8:391-408). More specifically, the fucosylated cell surface glycoconjugates, which are present
on the apical surface of various epithelia, contribute to resistance to various microbial agents,
including bacteria such as Helicobacter pylori (Umesaki et al. Science 1997;276:964-5) and E. coli
(Vogeli et al. Schweiz Arch Tierheilkd 1997;139:479-84), and viruses such as HIV (Ali et al. J Infect
Dis 2000;181:737-9). On the other hand, abnormal upregulation of fucosyltransferases is a common
finding in various types of tumors, and causes an increased production of fucosylated
glycoconjugates. Such fucosylated glycoconjugates can also serve as tumor markers and include (1)
the CA19-9 cancer antigen, which is a circulating sialyl-Lewis a structure produced by hFUT3 and
used for the diagnosis of pancreatic and gastric cancer (Koprowski et al. Somatic Cell Genet
1979;5:957-71), and (2) alpha-fetoprotein, whose alpha-1,6-fucosylation is reduced in hepatoma
(Miyoshi et al. Biochim Biophys Acta 1999;1473:9-20). In addition, aberrant production of
fucosylated glycoconjugates can provide a selective growth advantage by facilitating the
extravasation of tumor cells, since they participate in endothelial adhesion through interaction with
the E- and P-selectins of endothelial cells (Butcher and Picker, Science 1996;272:60-6).
Consequently, modulation of fucosyltransferase activity can modify tumorigenicity in various tumor
models including hepatoma (Miyoshi et al. Biochim Biophys Acta 1999;1473:9-20) and colorectal
adenocarcinoma (Weston et al. Cancer Res 1999;59:2127-35).

Thus, it is believed that the protein of the invention of SEQ ID NO: 284 is a
glycosyltransferase, preferably a hexosyltransferase, more preferably a fucosyltransferase, even
more preferably an alpha-1,3-fucosyltransferase, and as such plays a role in fertilization,
embryogenesis, lymphocyte trafficking, immune response, cancer metastasis and resistance to
various microorganisms. Preferred polypeptides of the invention are polypeptides comprising the
amino acids of SEQ ID NO: 284 from positions 70 to 406, and 315 to 345. Other preferred
polypeptides of the invention are fragments of SEQ ID NO: 284 having any of the biological
activities described herein. The glycosyltransferase activity of the protein of the invention or part
thereof may be assayed using any of the assays known to those skilled in the art including those
described in Palcic et al. Carbohydr Res 1990;196:133-40.

Fucosylated compounds have considerable potential both as therapeutics and as reagents for
clinical assays. However, synthesis of glycosylated compounds of potential commercial and/or
therapeutic interest is difficult because of the very nature of the saccharide subunits. A multitude of
positional isomers in which different substituent groups on the sugars become involved in bond
formation, along with the potential formation of different anomeric forms, are possible. As a result
of these problems, large scale chemical synthesis of most carbohydrates is not possible due to
economic considerations arising from the poor yields of desired products. Enzymatic synthesis
using glycosyl transferases such as fucosyltransferase provides an alternative to chemical synthesis
of carbohydrates. Enzymatic synthesis using glycosidases, glycosyl transferases, or combinations
thereof, has been considered as a possible approach to the synthesis of carbohydrates. As a matter
of fact, enzyme-mediated catalytic synthesis would offer dramatic advantages over the classical
synthetic organic pathways, producing very high yields of carbohydrates economically, under mild
conditions in aqueous solutions, and without generating notable amounts of undesired side products.
To date, however, such enzymes are difficult to isolate, especially from eukaryotic, e.g., mammalian
sources, because these proteins are only found in low concentrations and tend to be
membrane-bound. In addition to these enzymes being difficult to isolate, the acceptor (peptide)
specificity of glycosyl transferases is poorly understood. Thus, there is a need for obtaining
recombinant glycosyl transferases, including fucosyltransferases, that could be produced in very
large amounts.

Thus, the invention relates to methods and compositions using the protein of the invention
or part thereof to synthesize glycosylated compounds, either glycoproteins, glycolipids, or
oligosaccharides, more particularly fucosylated compounds. If necessary, the protein of the
invention or part thereof may be produced in a soluble form by removing its transmembrane
domains and/or its Golgi retention signal using any of the methods known to those skilled in the art
including those described in US patent 5,776,772. For example, the protein of the invention or part
thereof is added to a sample containing GDP-fucose and a substrate compound in conditions allowing
glycosylation, more particularly fucosylation, and allowed to catalyze the glycosylation of this
compound. In a preferred embodiment, the enzymatic reaction carried out by the protein of the
invention is part of a series of other chemical and/or enzymatic reactions aiming at the synthesis of
complex glycosylated compounds, such as the ones described in US patents 5,409,817 and 5,374,541.
In another preferred embodiment where the method is to be practiced on a commercial scale, it may be
advantageous to immobilize the glycosyltransferase on a support. This immobilization facilitates the
removal of the enzyme from the batch of product and subsequent reuse of the enzyme. Immobilization
of glycosyltransferases can be accomplished, for example, by removing from the transferase its
membrane-binding domain, and attaching in its place a cellulose-binding domain. One of skill in the
art will understand that other methods of immobilization could also be used and are described in the
available literature.

In a preferred embodiment embodiment, the present invention relates to processes and 
compositions for producing glycosylated compounds, preferably fucosylated compounds, wherein a 
cell is genetically engineered to produce the protein of the invention or part thereof and used in 
combination with one or several other cells able to produce the donor substrate for the protein of the
invention.

In another preferred embodiment, the present invention relates to a process and 
compositions for controlling the glycosylation of proteins in a cell wherein an insect, plant, or 
animal cell is genetically engineered to produce one or more enzymes that provide internal control 
of the cell's glycosylation mechanism. Preferably, the invention relates to a Chinese hamster ovary 
(CHO) cell line that is genetically engineered to produce a fucosyltransferase of the present
invention either alone or in combination with other glycosyltransferases. This supplemental
fucosyltransferase modifies the glycosylation machinery to produce glycoproteins having
carbohydrate structures that more closely resemble naturally occurring human glycoproteins. The
methods for performing the above process and making the above compositions are carried out using 
the methods known in the art and described in U.S. Patent No. 5,047,335. 

Another embodiment of the present invention relates to compositions and methods using the 
protein or part thereof to detect fucosylated conjugates. In a preferred embodiment, the protein of
SEQ ID No: 284 or part thereof is used to obtain reagents, such as antibodies. These reagents
could be used in radioimmunoassays, competitive binding assays, Western Blot analysis, enzyme-
linked immunosorbent assay (ELISA), immunohistochemistry, or any other technique known to
those skilled in the art (Palcic et al. Carbohydr Res 1990;196:133-40). In a preferred embodiment,
antibodies raised against the protein of the invention provide tools to specifically visualize
salivary or digestive tract tissues (and cells derived from the tissues). This can be useful for various 
applications, including the determination of the origin or identity of cells, e.g. cancerous cells, as 
well as to facilitate the identification of particular cells and tissues for, e.g. the evaluation of 
histological slides. Such assays may also be used for diagnosis in various disorders including, but
not limited to, neoplastic tumors such as salivary, prostate, liver, digestive tract and
pancreas cancers. Various types of samples can be assayed, including tumor tissues, or other
biological samples such as serum or plasma. 

The invention further relates to glycosylated compounds, preferably fucosylated 
compounds, obtained using any of the processes described herein using the protein of the invention
or part thereof. Such compounds may be used in the diagnosis, prevention and/or treatment of
disorders including, but not limited to, cancer, cystic fibrosis, ulcer, inflammation and immune
based disorders, including autoimmune disorders such as arthritis, fertility disorders, and 
hypothyroidism. These conditions include infectious diseases where active infection exists at any 
body site, such as meningitis and salpingitis; complications of infections including septic shock,
disseminated intravascular coagulation, and/or adult respiratory distress syndrome; acute or chronic
inflammation due to antigen, antibody and/or complement deposition; inflammatory conditions 
including arthritis, cholangitis, colitis, encephalitis, endocarditis, glomerulonephritis, hepatitis, 
myocarditis, pancreatitis, pericarditis, reperfusion injury and vasculitis. Immune-based diseases 
include but are not limited to conditions involving T-cells and/or macrophages such as acute and
delayed hypersensitivity, graft rejection, and graft-versus-host disease; auto-immune diseases
including type I diabetes mellitus and multiple sclerosis. In a preferred embodiment, these 
glycosylated compounds or derivatives thereof may be used as pharmacological agents to trap 
pathogens or endogenous ligands thus reducing the binding of pathogens or endogenous ligands to 
the endogenous glycosylated compounds. For example, such compounds may be used to prevent
and/or inhibit the adhesion of cancer cells to the inner wall of blood vessels or aggregation between
cancer cells and platelets, thus reducing cancer metastasis, or to prevent and/or inhibit the adhesion of
neutrophils to blood vessel endothelial cells. Other disorders include infections in which
recognition of a glycosylated product is essential to the development of the infection. Such
infections include, but are not limited to those caused by Helicobacter pylori, E. coli and viruses 
such as HIV. In a preferred embodiment, such compounds, preferably oligosaccharides, are used as 
gram positive antibiotics and disinfectants (U.S. Pat. Nos. 4,851,338 and 4,665,060).

The invention further relates to methods and compositions using the protein of the invention
or part thereof for diagnosis, prevention and/or treatment of several disorders in which recognition
of glycosylated compounds, preferably of fucosylated compounds, is impaired or needs to be 
impaired. For diagnostic purposes, the expression of the protein of the invention could be 
investigated using any of the Northern blotting, RT-PCR or immunoblotting methods described
herein and compared to the expression in control individuals. For prevention and/or treatment
purposes, inhibiting the endogenous expression of the protein of the invention may be used to
reduce the production of glycosylated compounds detrimental to the organism using any of the
antisense or triple helix methods described herein as well as antagonists of the protein's activity.

In another embodiment, various substances can be used for treatment, attenuation and/or
prevention of abnormal conditions associated with unbalanced amounts and/or activity
of the protein of SEQ ID No. 284. Such substances include, but are not limited to, chemical 
compounds such as agonists and antagonists, nucleic acids, and antibodies. In particular, the protein 
of the invention or part thereof may be used in the development of inhibitors of glycosyl transferase, 
more particularly inhibitors of fucosyltransferases, for mechanistic and clinical applications (Taylor,
Curr Opin Struc Biol 1996;6:830-7; Colman, Pure Appl Chem 1995;67:1683-8; Bamford, Enz
Inhib 1995;10:1-16; Khan & Matta, In Glycoconjugates, Composition, Structure, and Function.
pp361-378. eds., Allen, H. J. & Kisailus, E. C. Marcel Dekker, Inc. New York, 1992; Thome-
Tjomsland et al., Transplantation 2000;69:806-8; Basset et al., Scand J Immunol 2000;51:307-11).
Such substances may be employed for a variety of therapeutic and prophylactic
purposes, including the treatment of certain types of neoplastic disorders. For instance, substances targeted against
the protein of SEQ ID No. 284 can be administered to treat patients affected with, but not limited to,
salivary, prostate, liver, digestive tract and pancreas cancers. Alternatively, such substances
can be used for treatment, attenuation and/or prevention of infectious disease in order to induce
resistance to various microbial agents.

Protein of SEQ ID NO: 292 (internal designation 181-10-1-0-D10-CS)

The protein of SEQ ID NO:292 is encoded by the cDNA of SEQ ID NO:51. Accordingly, 
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:292 
described throughout the present application also pertain to the polypeptide encoded by a nucleic 
acid included in clone 181-10-1-0-D10-CS. In addition, it will be appreciated that all characteristics
and uses of the nucleic acid of SEQ ID NO:51 described throughout the present application also
pertain to the nucleic acid included in clone 181-10-1-0-D10-CS.

The protein of SEQ ID NO:292 was identified among the cDNAs from a library constructed
from fetal liver. Tissue distribution analysis using databases indicated that mRNA encoding this 
protein was found primarily in fetal kidney and fetal liver. 

The protein of SEQ ID NO:292 is most likely a polymorphic variant (92% identity) of
human secreted protein SEQ ID NO: 197 described in PCT publication WO
9906553-A2, the disclosure of which is incorporated herein by reference in its entirety. Further, the
protein of SEQ ID NO:292 is homologous to the C-type lectin domain of mouse macrophage
asialoglycoprotein-binding protein (M-ASGP-BP, 36% identity), mouse natural killer (NK) cell
surface protein P1-40 (NKR-P1.9, 34%) and human asialoglycoprotein receptor L-H2 from EP
773289-A2 (27%). Thus, the present invention relates to nucleic acid and amino acid sequences of
a lectin-like protein and to the use of these sequences in the diagnosis, study, prevention and 
treatment of disease. 
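
The percent identity figures quoted above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by some alignment program and counts identical residues over the columns in which neither sequence has a gap; other tools divide by a different length, so the convention here is an assumption rather than the method used to obtain the figures above. The function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, for illustration only).
TOY_QUERY = "MKTLLVA-GCW"
TOY_SUBJECT = "MKSLLVAQGCW"

if __name__ == "__main__":
    print(percent_identity(TOY_QUERY, TOY_SUBJECT))
```

With the toy pair, ten ungapped columns are compared and nine are identical.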

The protein of SEQ ID NO:292 consists of 111 amino acids. From the amino acid
alignments and the hydrophobicity plots, it has a predicted signal peptide sequence spanning
residues 12-24 and one predicted transmembrane domain spanning residues 5-25. Accordingly, one
embodiment of the present invention is a polypeptide comprising the signal peptide or the 
transmembrane domain. 
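
A hydrophobicity plot of the kind referred to above can be sketched as a sliding-window average of per-residue hydropathy values. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a cutoff of 1.6, which are commonly used settings for flagging candidate membrane-spanning segments; the present application does not state which scale, window or cutoff was used, so these choices, the function names and the toy sequence are assumptions for illustration only.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]  # raises KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophobic_windows(seq: str, window: int = 19,
                        threshold: float = 1.6) -> list[tuple[int, int]]:
    """1-based (start, end) positions of windows whose mean meets the threshold."""
    return [
        (i + 1, i + window)
        for i, score in enumerate(hydropathy_profile(seq, window))
        if score >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence: a hydrophobic stretch flanked by charged residues.
    toy = "DDDDD" + "I" * 19 + "DDDDD"
    print(hydrophobic_windows(toy))
```

Overlapping windows that pass the cutoff would normally be merged into a single candidate segment; that step is omitted here to keep the sketch short.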

A number of different protein families share a conserved domain which was first 
characterized in some animal lectins and which seems to function as a calcium-dependent
carbohydrate-recognition domain (Drickamer K., J. Biol. Chem., 263:9557-9560, 1988, the
disclosure of which is incorporated herein by reference in its entirety). This domain, which is
known as the C-type lectin domain (CTL) or as the carbohydrate-recognition domain (CRD),
consists of about 110-130 residues. There are four cysteines that are perfectly conserved and
involved in two disulfide bonds. Several categories of proteins can be found in which the CTL
domain has been described. Both M-ASGP-BP and NKR-P1 are type II membrane proteins. Type
II membrane proteins in which the CTL domain has been located at the C-terminal extremity
include: 1) Asialoglycoprotein receptors (ASGPR), also known as hepatic lectins. The ASGPRs
mediate the endocytosis of plasma glycoproteins from which the terminal sialic acid residue of their
carbohydrate moieties has been removed. 2) A number of proteins expressed on the surface of NK
cells and some subsets of T cells (NKG2, NKR-P1, Ly-49, CD69) and on B cells (CD72, LyB-2).
The CTL domain in these proteins is distantly related to other CTL domains, and it is unclear
whether they all bind carbohydrates. 

M-ASGP-BP is a lectin-like molecule expressed on the surface of activated macrophages 
and specific for terminal D-galactose and N-acetyl-D-galactosamine units (Oda S et al., J.
Biochem., 104:600-605, 1988, the disclosure of which is incorporated herein by reference in its
entirety). Experimental results suggest that M-ASGP-BP participates in the interaction between 
tumoricidal macrophages and tumor cells.

ASGPR is a membrane protein expressed specifically by hepatocytes. Its function is to
take up asialoglycoproteins from the serum for degradation in the liver. Partially deglycosylated
plasma glycoproteins are efficiently and specifically removed from the circulation by a receptor-
mediated process. In mammals, the ASGPR, specific for desialylated (galactosyl-terminal)
glycoproteins, is expressed exclusively in hepatic parenchymal cells. Following binding of the
ligand to this cell surface receptor, the receptor-ligand complex is internalized and transported by a 
series of membrane vesicles and tubules to an acidic-sorting organelle where receptor and ligand 
dissociate (Spiess M et al., J. Biol. Chem., 260:1979-1982, 1985, the disclosure of which is 
incorporated herein by reference in its entirety). Reduction in expression of ASGPR has been
reported in response to such liver conditions as hepatic cirrhosis, liver cancer and regenerated liver
(Stadalnik et al., J. Nucl. Med., 26:1233-1242, 1985, the disclosure of which is incorporated herein
by reference in its entirety). It has also been reported that ASGPR itself is present in serum
(Katsugi et al., Alcohol Metabolism and Liver, 12:65-68, 1992, the disclosure of which is
incorporated herein by reference in its entirety), which resulted in significant research being
pursued toward the measurement of serum ASGPR. Furthermore, published results indicate that
labeling compounds binding to ASGPR can be used as good indicators of liver function (Kudo, et 
al., Japan Assoc. of Gastrointest. Pathology., 89:1349-1359, 1992, the disclosure of which is 
incorporated herein by reference in its entirety). 

NK cells constitute the third major population of lymphocytes. They possess the inherent
capacity to kill various tumors and virally infected cells and mediate the rejection of allografts.
These properties allow NK cells to have a major role in the regulation of innate immune responses 
in particular, and immunological functions in general. Members of the NKR-P1 family are type-II 
transmembrane C-type lectin receptors found on the surface of NK cells and a subset of T 
lymphocytes (NK T cells). Further, a subset of NKR-P1 molecules has been identified at the
surface of peripheral blood monocytes and dendritic cells (Poggi A et al., Eur. J. Immunol., 27:
2965-2970, 1997, the disclosure of which is incorporated herein by reference in its entirety). 
Deficiencies in NKR-P1+ T cells, which preferentially accumulate in the liver and bone marrow,
have been implicated in the susceptibility to many diseases including insulin-dependent diabetes
mellitus (IDDM, Tori M et al. Transplantation, 70:32-38, 2000, the disclosure of which is
incorporated herein by reference in its entirety) and multiple sclerosis (Poggi A et al., J. Immunol.,
162: 4349-4354, 1999, the disclosure of which is incorporated herein by reference in its entirety). 
NKR-P1 receptors have been shown to activate NK cell cytotoxicity coupled with release of
interferon-γ (IFN-γ, Brown MG., Immunol. Rev., 155: 55-75, 1997, the disclosure of which is
incorporated herein by reference in its entirety). However, unlike the well-characterized MHC class
I ligands that regulate the specificity of the Ly-49 family of molecules, which are structurally
related to the NKR-P1 receptors, cognate ligands for the NKR-P1 molecules have yet to be
identified. Interestingly, it has been reported that a subset of the NKR-P1 molecules, NKR-P1B,
inhibits NK cell activation (Carlyle et al., J. Immunol., 162:5917-5923, 1999, the disclosure of
which is incorporated herein by reference in its entirety). 

Based on the structural and chemical homologies, the protein of SEQ ID NO:292 was
characterized as a C-type lectin-like, type II membrane protein, whose ligand binding may be
calcium dependent. The protein of SEQ ID NO:292 or fragments thereof may provide the basis for
clinical diagnosis of diseases associated with its induction and/or repression. This protein, fragments
thereof or antagonists/inhibitors thereof may be useful in the diagnosis and treatment of tumors,
viral infections, inflammation, or conditions associated with impaired immunity, organ
transplantation, bacterial infections, autoimmunity, hepatic dysfunction and liver regeneration.
Furthermore, the protein of SEQ ID NO:292 or fragments thereof may be used as a reagent for
analyzing the control of gene expression by IFNs and other cytokines such as IL-12 and IL-4, as
well as growth and transcription factors, in normal and diseased cells. 

The protein of SEQ ID NO:292 has homology to the CTL domains of the ASGPR, M-ASGP-BP
and NKR-P1 molecules. The protein of SEQ ID NO:292, in membrane-bound or soluble
forms, may have cytokine receptor activity, cell proliferation/differentiation activity, T cell
activation activity, tissue growth regulating activity, receptor/ligand activity, signal transduction
activity, transendothelial migration promoting activity, anti-inflammatory activity, and tumor inhibition activity,
among others. Accordingly, the protein of SEQ ID NO:292 or fragments thereof may be used in
diagnosis and treatment of diseases such as, but not limited to, autoimmune disorders such as
autoimmune hepatitis, rheumatoid arthritis, Graves disease, systemic lupus erythematosus,
Wegener's granulomatosis, sarcoidosis, polyarthritis, pemphigus, pemphigoid, erythema multiforme,
Sjogren's syndrome, inflammatory bowel disease, autoimmune encephalitis, myasthenia gravis,
keratitis, scleritis, lupus nephritis, and allergic encephalomyelitis; proliferative disorders including
various forms of cancer such as leukemias, lymphomas (Hodgkins and non-Hodgkins), sarcomas,
melanomas, adenomas, carcinomas of solid tissue, hypoxic tumors, squamous cell carcinomas of
the mouth, throat, larynx, and lung, genitourinary cancers such as cervical and bladder cancer,
hematopoietic cancers, head and neck cancers, and nervous system cancers, benign lesions such as
papillomas, atherosclerosis, angiogenesis; viral infections, in particular HBV, HCV and HIV
infections, as well as other viral- and pathogen-induced infections. The protein of SEQ ID NO:292
or fragments thereof may also be used to treat conditions associated with inflammation or immune
impairment (e.g. rheumatoid arthritis, osteoarthritis and AIDS), allergy, hepatic cirrhosis and liver
toxicity; as well as genetic disorders, chronic illnesses and infections associated with a decrease in
NK, NK T, macrophage, monocyte and dendritic cell functions. In another embodiment of the
invention, inhibitors of the protein of SEQ ID NO:292 may be used to treat conditions such as
multiple sclerosis, IDDM, graft versus host disease (GVH) and transplanted organ rejection.

Another embodiment relates to methods to treat and/or prevent the bacterial infections that
arise in the liver due to bacterial antigens brought from the intestine via the portal vein. In this
embodiment, the protein of SEQ ID NO:292 may be used to counteract the effects of the bacterial
endotoxin lipopolysaccharide (LPS). Another embodiment of the invention is the use of the protein
of SEQ ID NO:292 or fragments thereof to inhibit NK cells activated by bacterial superantigens
or LPS, which would help treat vascular endothelial injury in conditions such as Kawasaki disease.

The appearance of autoantibodies against the protein of SEQ ID NO:292 can be used as an
indicator for autoimmune hepatitis (AIH), a disease that can lead to cirrhosis and fatal intractable
hepatitis, as well as primary biliary cirrhosis. The nucleic acid sequences encoding the protein of
SEQ ID NO:292 or fragments thereof can be used for producing secreted forms of the protein. They
can also be used to develop products for diagnosis and therapy. Accordingly, recombinant soluble
derivatives can be used for detecting and measuring antibodies specific for the protein of the
invention, e.g. by ELISA, Western blotting, etc. This allows AIH to be diagnosed and distinguished
from other diseases.

In another embodiment of the invention, the protein of SEQ ID NO:292 or fragments or 
derivatives thereof can also be used for the analysis and purification of asialoglycoproteins and to
develop inhibiting agents against asialoglycoprotein incorporation, or viral and other protein
invasion, into liver cells. 

Another embodiment of the present invention relates to polypeptides comprising the protein 
of SEQ ID NO:292 or fragments thereof and polynucleotides encoding the protein of SEQ ID 
NO:292 or fragments thereof. In another aspect the protein of SEQ ID NO:292 or fragments thereof
may be used to identify specific molecules to which it binds such as agonists, antagonists or
inhibitors. In a further aspect, the invention relates to methods for identifying agonists and
antagonists/inhibitors of the protein of SEQ ID NO:292, and treating conditions associated with the
protein of the invention or its imbalance with the identified compounds. In a still further aspect, the
invention relates to diagnostic assays for detecting diseases associated with inappropriate levels or
activity of the protein of SEQ ID NO:292. Another embodiment of the invention relates to methods
of measuring the amount of the protein of SEQ ID NO:292 in serum. Another embodiment relates 
to the use of labeling compounds that bind to the protein of SEQ ID NO:292 and can be used as 
good indicators of liver function or NK cell activity, among others. 

An embodiment of the present invention relates to methods of using the protein of the
invention or part thereof to identify and/or quantify ligands which may interact with the
protein of SEQ ID NO:292. The protein of SEQ ID NO:292 or fragments thereof may be included in
pharmaceutical preparations for treating cancer or prevention/treatment of other diseases associated
with changes in expression of the protein of the invention (see above). In a preferred embodiment
of the invention the protein of the invention or part thereof is used to modulate the effect of
cytokines and related molecules such as IL-1, IL-2, IL-12, IFN-γ. The protein of SEQ ID NO:292
may also be used to correct defects in in vivo models of disease such as autoimmunity, inflammation,
pathogen-mediated infection, liver toxicity, allograft rejection, GVH, as well as tumor models, by
injecting the protein either intraperitoneally, intravenously, subcutaneously or directly in the
diseased tissue. 

The DNA encoding the protein of SEQ ID NO:292 or fragments thereof may be used in
diagnostic assays for conditions/diseases associated with up-regulation or down-regulation of the
expression of the protein of the invention (see above). The diagnostic assay is useful to distinguish
between absence, presence, and excess expression of the protein and to monitor regulation of levels
of the protein during therapeutic intervention. The DNA may also be incorporated into effective
eukaryotic expression vectors and directly targeted to a specific tissue, organ, or cell population for
use in gene therapy to treat the above mentioned conditions, including tumors and/or to correct
disease- or genetic-induced defects in any of the above mentioned proteins including the protein of
the invention. The DNA may also be used to design antisense sequences and ribozymes, which can
be administered to modify gene expression in NK cells, NK T cells, macrophages, monocytes and dendritic
cells and to influence expression of cytokines such as IL-1, IL-2, IL-4, IL-12, and IFN-γ. In vivo
delivery of genetic constructs into subjects can be developed to the point of targeting specific cell
types, such as tumor cells, where expression of the protein of SEQ ID NO:292 may be affected or is
modulating the expression and/or activity of other proteins such as cytokines, growth factors, their
receptors and/or tumor antigens. The DNA may also be used to identify unknown upstream
sequences (e.g. promoters and regulatory elements) by standard techniques and for research into
the control of gene expression by IFNs and other cytokines, as well as growth and transcription
factors in normal and diseased cells. Hybridization probes are useful to detect DNA encoding the
protein of SEQ ID NO:292 (or closely related molecules) in biological samples, and for mapping
the naturally occurring genomic sequence to a particular chromosome/chromosome region. The
DNA may be used to generate and/or treat in vivo animal models of disease, including susceptibility
or resistance to infection, tumors, autoimmune conditions, GVH, allograft rejection and liver
toxicity, based on vaccine, knock-out and transgene technologies.

Antibodies against the protein of SEQ ID NO:292 are useful for the diagnosis of conditions 
and disease associated with its expression and to quantify the protein of the invention (e. g. in 
assays to monitor patients during therapeutic intervention). Antibodies specific for the protein may 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab fragments and fragments
produced by a Fab expression library. Neutralizing antibodies are especially preferred for
diagnostics and therapeutics. Diagnostic assays for the protein of the invention include methods
utilizing the antibody and a label to detect the protein of SEQ ID NO:292 in human body fluids or 
extracts of cells or tissues as well as methods for detecting or measuring antibodies against the 
protein of SEQ ID NO:292. 

The protein of SEQ ID NO:292 and its catalytic or immunogenic fragments or oligopeptides
thereof can be used for screening therapeutic compounds in any of a variety of drug screening
techniques including high throughput screening. Methods which may be used to quantitate the expression of
the nucleotide or protein of the invention include, but are not limited to, polymerase chain reaction
(PCR), RT-PCR, RNAse protection, Northern blotting, enzyme-linked immunosorbent assay
(ELISA), radioimmunoassay (RIA), fluorescent activated cell sorting (FACS), immunoprecipitation,
and chromatography.

Accordingly, the protein of SEQ ID NO:292 or fragments thereof may be used to purify or
enrich proteins containing carbohydrates. In such embodiments, the lectin of the present invention
is placed in contact with carbohydrate-containing proteins under conditions which facilitate specific
binding. The lectin of the present invention may be fixed to a solid support. After binding, 
specifically bound proteins are dissociated using appropriate salt or other conditions. 

The protein of SEQ ID NO:292 or fragments thereof may also be used to regulate any of the
activities described above, including the interaction between tumoricidal macrophages and tumor
cells and the activity of NK cells, to treat and/or prevent bacterial infections arising from bacterial antigens
brought from the intestine, or to counteract the effects of bacterial LPS.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:292,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or 
ameliorate a condition in an individual. For example, the condition may be any of those described 
above or an abnormality in any of the functions listed above. In such embodiments, the protein of 
SEQ ID NO:292, or a fragment thereof, is administered to an individual in whom it is desired to
increase or decrease any of the activities of the protein of SEQ ID NO:292. The protein of SEQ ID
NO:292 or fragment thereof may be administered directly to the individual or, alternatively, a 
nucleic acid encoding the protein of SEQ ID NO:292 or a fragment thereof may be administered to 
the individual. Alternatively, an agent which increases the activity of the protein of SEQ ID 
NO:292 may be administered to the individual. Such agents may be identified by contacting the
protein of SEQ ID NO:292 or a cell or preparation containing the protein of SEQ ID NO:292 with a
test agent and assaying whether the test agent increases the activity of the protein. For example, the 
test agent may be a chemical compound or a polypeptide or peptide. 

Alternatively, the activity of the protein of SEQ ID NO:292 may be decreased by 
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:292 may be identified by contacting the protein of
SEQ ID NO:292 or a cell or preparation containing the protein of SEQ ID NO:292 with a test agent 
and assaying whether the test agent decreases the activity of the protein. For example, the agent 
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an 
antisense nucleic acid or a triple helix-forming nucleic acid. 
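
The screening steps described above (contacting the protein with a test agent and assaying whether activity rises or falls) can be sketched as a comparison of activity readings with and without the agent. The sketch below is illustrative only: the function name, the fold-change cutoff of 1.5 and the sample readings are assumptions that do not come from the present application, and a real assay would also need replicates and an appropriate statistical test.

```python
from statistics import mean


def classify_agent(control: list[float], treated: list[float],
                   min_fold_change: float = 1.5) -> tuple[str, float]:
    """Compare mean activity with a test agent to mean activity without it.

    Returns a label and the fold change (treated mean / control mean).
    The cutoff is an arbitrary illustrative value, not an assay standard.
    """
    if not control or not treated:
        raise ValueError("both groups need at least one measurement")
    baseline = mean(control)
    if baseline <= 0:
        raise ValueError("control activity must be positive")
    fold = mean(treated) / baseline
    if fold >= min_fold_change:
        return "increases activity", fold
    if fold <= 1 / min_fold_change:
        return "decreases activity", fold
    return "no clear effect", fold


if __name__ == "__main__":
    # Hypothetical activity readings in arbitrary units.
    label, fold = classify_agent([10.0, 11.0, 9.0], [21.0, 19.0, 20.0])
    print(label, round(fold, 2))
```

An agent labeled as increasing activity would correspond to the agonist-type agents described above, and one labeled as decreasing activity to the interfering agents.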

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, fetal liver or fetal kidney, or to distinguish between two or more possible sources of a
sample on the basis of the level of the protein of SEQ ID NO:292 in the sample. For example, the
protein of SEQ ID NO:292 or fragments thereof may be used to generate antibodies using any 
techniques known to those skilled in the art, including those described herein. Such antibodies may
then be used to identify tissues of unknown origin, for example, forensic samples, differentiated
tumor tissue that has metastasized to foreign bodily sites, or to differentiate different tissue types in
a tissue cross-section using immunochemistry. In such methods a sample is contacted with the 
antibody, which may be detectably labeled, under conditions which facilitate antibody binding. The 
level of antibody binding to the test sample is measured and compared to the level of binding to 
control cells from fetal liver or fetal kidney or tissues other than fetal liver or fetal kidney to
determine whether the test sample is from fetal liver or fetal kidney. Alternatively, the level of the
protein of SEQ ID NO:292 in a test sample may be measured by determining the level of RNA 
encoding the protein of SEQ ID NO:292 in the test sample. RNA levels may be measured using 
nucleic acid arrays or using techniques such as in situ hybridization, Northern blots, dot blots or 
other techniques familiar to those skilled in the art. If desired, an amplification reaction, such as a
PCR reaction, may be performed on the nucleic acid sample prior to analysis. The level of RNA in
the test sample is compared to RNA levels in control cells from fetal liver or fetal kidney or tissues 
other than fetal liver or fetal kidney to determine whether the test sample is from fetal liver or fetal 
kidney. 
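
The comparison described above, in which the level measured in a test sample is compared with levels in control tissues, can be sketched as assigning the sample to the control group whose mean level is closest. This is a simplified illustration: the function name, the group labels and the numerical levels are hypothetical, and the present application does not prescribe any particular decision rule.

```python
from statistics import mean


def closest_control(test_level: float,
                    controls: dict[str, list[float]]) -> str:
    """Name of the control group whose mean level is nearest the test level."""
    if not controls:
        raise ValueError("at least one control group is required")
    return min(
        controls,
        key=lambda name: abs(test_level - mean(controls[name])),
    )


if __name__ == "__main__":
    # Hypothetical protein or RNA levels in arbitrary units.
    reference = {
        "fetal liver or fetal kidney": [8.0, 9.0, 10.0],
        "other tissue": [0.5, 1.0, 1.5],
    }
    print(closest_control(7.5, reference))
```

In practice the spread of the control measurements would also be taken into account, for example by requiring the test level to fall within a reference range before a source is assigned.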

In another embodiment, antibodies to the protein of the invention or part thereof may be 
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:292,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:292 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:292 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody. 

In another embodiment of the present invention, the protein of SEQ ID NO:292 or a 
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:292. In such techniques, the level of the protein of SEQ ID NO:292 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:292 in the ill individual is compared to the level in normal individuals to
determine whether the individual has a level of the protein of SEQ ID NO:292 which is associated
with disease. 

Protein of SEQ ID NO: 408 (internal designation 179-14-2-0-F11-CS)

The 236 amino acid protein of SEQ ID NO: 409, herein referred to as PNMT A, and
encoded by the cDNA of SEQ ID NO: 168 is found in fetal kidney and fetal brain. PNMT A is a
polymorphic variant of human phosphatidylethanolamine N-methyltransferase (PNMT)
(SPTREMBLNEW SPTREMBL SWISSPROT accession number Q9UHY6). PNMT A differs
from the sequence of PNMT (STR accession number Q9UHY6) by two amino acid residues. Position 95
contains an isoleucine residue (I) substituted for a valine residue (V); position 130 contains a valine
residue (V) substituted for a glutamine residue (G). PNMT A displays 4 candidate membrane-
spanning segments in positions 50 to 70, 83 to 103, 131 to 151 and 196 to 216. 
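
The position-by-position comparison described above may be sketched as follows. This is a minimal illustration assuming two ungapped sequences of equal length; the function name and the short sequences are hypothetical and are not the sequences of SEQ ID NO: 409 or of accession Q9UHY6.

```python
def list_substitutions(reference, variant):
    """Return (position, reference_residue, variant_residue) for every
    mismatch between two equal-length sequences. Positions are 1-based,
    matching the numbering convention used in the text.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length (gaps are not handled)")
    return [
        (pos, ref_res, var_res)
        for pos, (ref_res, var_res) in enumerate(zip(reference, variant), start=1)
        if ref_res != var_res
    ]

# Hypothetical toy sequences differing at a single position.
reference = "MTRLLGYVDPL"
variant = "MTRLIGYVDPL"

for pos, ref_res, var_res in list_substitutions(reference, variant):
    print(f"position {pos}: {ref_res} -> {var_res}")
```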

Catecholamine neurotransmitters [e.g., dopamine, noradrenaline (norepinephrine),
adrenaline (epinephrine)] are synthesized in catecholaminergic neurons from tyrosine, via dopa,
dopamine and noradrenaline, to adrenaline. Four enzymes are involved in the biosynthesis of
adrenaline: (1) tyrosine 3-mono-oxygenase (tyrosine hydroxylase, TH); (2) aromatic L-amino acid
decarboxylase (AADC, or Dopa decarboxylase, DDC); (3) dopamine beta-mono-oxygenase
(dopamine beta-hydroxylase, DBH); and (4) noradrenaline N-methyltransferase
(phenylethanolamine N-methyltransferase, PNMT) (Nagatsu, Neurosci. Res. 12:315-345 (1991)).
PNMT, the final enzyme in the pathway for adrenaline biosynthesis, catalyses the production of
adrenaline from noradrenaline using S-adenosyl-L-methionine as a methyl donor. For this reason,
PNMT serves as a good marker for tissues and cells producing epinephrine (adrenaline). Studies
conducted by Kennedy and collaborators have shown that PNMT is widely distributed in human
tissues including heart and kidney (Kennedy et al., J. Clin. Invest. 95:2896-2902 (1995)).

In some pheochromocytomas, the tumors contain and secrete greater amounts of adrenaline
than do normal adrenal medullas. In a case/control study, Isobe et al. have shown that adrenaline-secreting
pheochromocytomas express significantly greater amounts of PNMT mRNA than do
normal adrenal medullas (Isobe et al., J. Urol. 163:357-362 (2000)). Moreover, PNMT
immunoreactivity is only detected in the adrenaline-secreting tumors. The C-1 region in the rostral
ventral lateral medulla contains mainly adrenaline neurons. These neurons are the tonic vasomotor
center of the brain. Burke et al. have demonstrated changes in the enzymatic activity of PNMT in
axon terminals and cell bodies of neurons from the medulla of patients with Alzheimer's disease.
They have also shown that PNMT protein is decreased in axon terminals in brains from patients
with Alzheimer's disease; the decrease in PNMT appears to be due to retrograde degeneration of
epinephrine neurons (Burke et al., Ann. Neurol. 22:278-280 (1987)). In the case of advanced
Alzheimer's disease, Burke et al. presented evidence that the accumulation of PNMT in the
perikarya results from diminished transport of this enzyme to axon terminals (Burke et al., J. Am.
Geriatr. Soc. 38:1275-1282 (1990)).

Neurons that contain PNMT have cell bodies in brain stem regions of the rat brain and send
projections mainly into other brain stem areas, such as the hypothalamus and the spinal cord. These
neurons can be affected pharmacologically by various kinds of drugs. PNMT inhibitors currently
represent the only means of modifying adrenaline neurons pharmacologically without affecting
noradrenaline or dopamine neurons in brain. Experiments conducted in deoxycorticosterone
acetate-salt (DOCA-salt) hypertensive rats and spontaneously hypertensive rats (SHR) have shown
that inhibitors of PNMT lower blood pressure (Goldstein et al., Life Sci. 30:1951-1957 (1982);
Lyang et al., Res. Commun. Chem. Pathol. Pharmacol. 46:319-329 (1984); Chatelain et al., J.
Pharmacol. Exp. Ther. 252:117-125 (1990)). Molecules and compounds affecting adrenaline
neurons may also be of use in the treatment of psychiatric disorders and neuroendocrine
dysfunction.

One embodiment of the subject invention provides polypeptides comprising the sequence of
PNMT A. Other polypeptides of the invention include polypeptides comprising the amino acids of
SEQ ID NO: 409 from positions 50 to 70, 83 to 103, 131 to 151 and/or 196 to 216. Also
encompassed by the instant invention are biologically active fragments of the PNMT A protein.
"Biologically active fragments" are defined as those peptide or polypeptide fragments of PNMT A
which have at least one of the biological functions of the PNMT A protein (e.g., the ability to
catalyze the formation of adrenaline). In a preferred embodiment, the biologically active fragment
of PNMT A contains at least one of the amino acid substitutions which distinguish PNMT A from
PNMT (i.e., an isoleucine residue (I) substituted for a valine residue (V) at position 95; and/or a
valine residue (V) substituted for a glutamine residue (G) at position 130). In one embodiment, the
PNMT A polypeptides of the invention are encoded by clone 179-14-2-0-F11-CS.
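
The coordinate conventions used above (1-based, inclusive positions) may be illustrated by the following sketch, which extracts a fragment from a sequence and reports which of the two distinguishing positions (95 and 130) a given fragment spans. The function names and the placeholder sequence are assumptions made for this example; the segment boundaries and substitution positions are those stated in the text.

```python
# Segment boundaries and substitution positions as stated in the text
# (1-based, inclusive).
MEMBRANE_SEGMENTS = [(50, 70), (83, 103), (131, 151), (196, 216)]
SUBSTITUTION_POSITIONS = (95, 130)

def extract_fragment(sequence, start, end):
    """Return residues start..end of sequence (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("fragment coordinates fall outside the sequence")
    return sequence[start - 1:end]

def covered_substitutions(start, end):
    """Return the distinguishing positions contained in start..end."""
    return [pos for pos in SUBSTITUTION_POSITIONS if start <= pos <= end]

# Placeholder 236-residue sequence; NOT the sequence of SEQ ID NO: 409.
placeholder = "A" * 236

for start, end in MEMBRANE_SEGMENTS:
    fragment = extract_fragment(placeholder, start, end)
    print(start, end, len(fragment), covered_substitutions(start, end))
```

Under these conventions, a fragment spanning positions 83 to 103 contains position 95, whereas a fragment spanning positions 131 to 151 contains neither distinguishing position.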

Thus, one embodiment of the invention provides an enzymatic component of the adrenaline
synthetic pathway and methods of producing adrenaline in accordance with methods known to those
skilled in the art. These methods substitute PNMT A, or biologically active fragments thereof, for
the PNMT enzyme used in these known synthetic pathways.

The invention also provides variants of the protein of SEQ ID NO: 409. These variants
have at least about 80%, more preferably at least about 90%, and most preferably at least about 95%
amino acid sequence identity to the amino acid sequence of PNMT A. Variants according to the
subject invention also have at least one functional or structural characteristic of PNMT A, such as
the ability to catalyze the formation of adrenaline. The invention also provides biologically active
fragments of the variant proteins. Unless otherwise indicated, the methods disclosed herein may be
practiced utilizing PNMT A or variants thereof. Likewise, the methods of the subject invention
may be practiced using biologically active fragments of PNMT A, or variants thereof, provided that said
biologically active fragments contain the amino acid substitutions noted supra.
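
As a non-limiting illustration, the percent identity thresholds recited above may be evaluated on a pre-computed pairwise alignment as sketched below. The choice of denominator (aligned columns without gaps), the treatment of "at least about" as a hard cut-off, the function names and the toy sequences are all assumptions of this example; other definitions of sequence identity are in common use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment in
    which neither sequence has a gap (one of several possible
    definitions of sequence identity).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

def identity_tier(pct):
    """Return the highest recited threshold (95, 90 or 80) met by pct,
    or None if pct is below 80.
    """
    for threshold in (95, 90, 80):
        if pct >= threshold:
            return threshold
    return None

# Hypothetical aligned sequences differing at one of ten positions.
pct = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(pct, identity_tier(pct))
```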

One embodiment of the subject invention provides methods of using the protein of the 
invention, or biologically active fragments thereof, to label (chemically or isotopically) the 
adrenaline molecule in vitro. The labeled adrenaline molecules can then be used to localize 
receptors in tissue sections by in situ hybridization experiments.

The invention also provides a fusion protein or polypeptide in which PNMT A, or
biologically active fragments thereof, are combined with another protein (tag) by the use of a
recombinant DNA molecule. The resulting purified and enzymatically active fusion product is then
added, in vitro, to the noradrenaline precursor and to S-adenosyl-L-methionine as a methyl donor.
The enzymatic reaction is then performed in conditions known to those skilled in the art (Burke et
al., Proc. Soc. Exp. Biol. Med. 181:66-70 (1986); Morimoto et al., Endocr. J. 40:179-183 (1993)).
In this reaction, the methyl group of S-adenosyl-L-methionine must be labeled isotopically ([14C]-S-adenosyl-L-methionine
or [methyl-3H]-S-adenosylmethionine), or chemically, in order to allow
the transfer of a "tagged" methyl group to the adrenaline molecule.

Similarly, in cells transfected with cDNAs encoding the protein of the invention, PNMT
activity of expressed proteins may be measured by incubating cytosolic fractions with [14C]-S-adenosyl-L-methionine
and normetanephrine for 60 min according to methods known to those
skilled in the art (Morimoto et al., ibid.). Agonists and/or antagonists of PNMT activity may also be
tested (high throughput screening) on transfected cells expressing the wild type form of the protein
of the invention. Again, the effects of such drugs on PNMT enzymatic activity are measured by the
methods described above.
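
By way of example only, the conversion of radioactivity counts from such an incubation into a specific enzymatic activity may be sketched as follows. The blank subtraction, the units, the function name and all numeric values are assumptions made for this illustration; only the 60 min incubation time is taken from the text, and the cited references should be consulted for the actual assay conditions.

```python
def specific_activity(sample_dpm, blank_dpm, dpm_per_pmol,
                      incubation_min, protein_mg):
    """Return enzymatic activity in pmol of labeled product formed
    per minute per mg of protein.

    sample_dpm     -- counts recovered in the product fraction
    blank_dpm      -- counts from a no-enzyme (or boiled-enzyme) control
    dpm_per_pmol   -- specific radioactivity of the labeled methyl donor
    incubation_min -- incubation time in minutes
    protein_mg     -- amount of protein in the incubation
    """
    if dpm_per_pmol <= 0 or incubation_min <= 0 or protein_mg <= 0:
        raise ValueError("specific radioactivity, time and protein must be positive")
    product_pmol = (sample_dpm - blank_dpm) / dpm_per_pmol
    return product_pmol / (incubation_min * protein_mg)

# Hypothetical values: 6200 dpm in the sample, 200 dpm in the blank,
# 100 dpm/pmol donor, 60 min incubation, 0.5 mg cytosolic protein.
print(specific_activity(6200, 200, 100, 60, 0.5))
```

The effect of a candidate agonist or antagonist could then be expressed as the ratio of the activity measured in its presence to the activity measured in its absence.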

The invention further relates to methods and compositions used to modify the protein of the
invention (i.e., derivatize the PNMT A protein). Post-translational modifications encompassed by
the invention include N-linked or O-linked carbohydrate chains, processing of N-terminal or
C-terminal ends, attachment of chemical moieties, such as polyethylene glycol, to the amino acid
backbone, chemical modifications of N-linked or O-linked carbohydrate chains, and addition or
deletion of an N-terminal methionine residue as a result of prokaryotic host cell expression. Some
of these modifications of the protein of the invention may facilitate its extraction and purification in
prokaryotic expression systems. Post-translational modifications such as the addition of N-linked or O-linked
carbohydrate chains may also optimize the enzymatic activity of the protein of the
invention when it is first produced in a prokaryotic system.

Another embodiment of the subject invention provides antibodies directed against the
protein of the invention or immunogenic fragments thereof. The antibodies of the invention are
useful for the screening of tissues and cells producing adrenaline or for affinity purification of
PNMT or PNMT A. These antibodies may also be used in the diagnosis of pathologies and
disorders such as pheochromocytomas and Alzheimer's disease, where PNMT A is overexpressed.
Methods of performing affinity purification as well as methods of making polyclonal and
monoclonal antibodies are well known to those skilled in the art.

In therapeutic regimens, neutralizing antibodies may be used as antagonists of PNMT A and
used to treat conditions associated with overexpression of PNMT A. These disorders include, but
are not limited to, hypertension, pheochromocytomas, and advanced Alzheimer's disease (Goldstein
et al., Life Sci. 30:1951-1957 (1982); Lyang et al., Res. Commun. Chem. Pathol. Pharmacol.
46:319-329 (1984); Chatelain et al., J. Pharmacol. Exp. Ther. 252:117-125 (1990); Isobe et al., J.
Urol. 163:357-362 (2000); Burke et al., J. Am. Geriatr. Soc. 38:1275-1282 (1990)).

Proteins of SEQ ID NOs: 395 and 403 (internal designations: 160-101-3-0-H2-CS and 160-99-4-0-B4-CS,
respectively)

The 367-amino-acid-long proteins of SEQ ID NOs: 395 and 403, encoded by the cDNAs of
SEQ ID NOs: 154 and 162 respectively, are polymorphic variants, the first one being overexpressed
in fetal brain and ovary and the second one in fetal brain only. They both contain glutathione S-transferase
(GST) domains from positions 47 to 122, and 206 to 309, which are respectively the G-site
and H-site described below. In addition, they also display two hydrophobic domains (from aa
258 to aa 278 and from aa 338 to aa 358) which are characteristic of some GST proteins.

Glutathione S-transferase proteins (GSTs) are dimeric proteins that catalyse the conjugation
of glutathione to a wide range of hydrophobic compounds (through the formation of a thioether
bond with their electrophilic centre) to create products which are less reactive, more
hydrophilic, and thus more easily excreted from the cells. The GST superfamily (E.C. 2.5.1.18) is
indeed believed to be one of the most important protein families in the detoxification of reactive
electrophiles within living cells. Glutathione is a cellular tripeptide (gamma-glutamylcysteinylglycine)
which is perhaps the most abundant amino acid derivative contained in
the cells of higher life forms. The middle amino acid in glutathione, cysteine, has a free thiol group
which can compete with the nucleophilic site on nucleotide bases for reaction with electrophiles.
Within the cell, glutathione functions so as to conjugate to xenobiotic toxic molecules in general,
and electrophiles in particular, to render the toxic molecules less reactive against cellular
macromolecules and to target the toxic molecules for subsequent metabolic and excretion pathways.
Based on amino acid sequence identity, there are at least seven major classes of GST
proteins (designated alpha, kappa, mu, pi, sigma, theta and zeta). Sequence similarity between
classes is rather low, ranging from 20 to 30%. However, a single point mutation in the H-subsite
region of GST is enough to shift substrate specificity from class pi to alpha (Nuccetelli M.N. et al.
Biochem. Biophys. Res. Commun. 252:184-189 (1998)). In spite of relatively low sequence identity,
the GSTs exhibit a high degree of structural similarity. It is generally known that the GST molecule
binds quite specifically and with high affinity to glutathione, but binds promiscuously to a wide
variety of xenobiotic, electrophilic, and alkylating chemical agents. All GST enzymes of the four
main cytosolic classes are found in dimeric form with two active sites per dimer, each of which
functions independently of the other. The active site has been characterized as consisting of a
glutathione binding region (designated the G-site) and a non-specific hydrophobic binding region
(designated the H-site) to accommodate the electrophilic substances. Pi-, mu-, alpha- and theta-class
crystal structures have been elucidated; all possess a similar GSH-binding site, but the hydrophobic
substrate-binding site (H-subsite) is subject to variation across the classes (Allardyce C.S. et al.
Biochem. J. 343:525-531 (1999)). GST activity has been suggested to be involved in the regulation
of the assembly of multisubunit complexes by shifting the balance between glutathione, disulfide
glutathione, thiol groups of cysteines, and protein disulfide bonds. The GST domain is a
widespread, conserved enzymatic module that may be covalently or noncovalently complexed with
other proteins. Regulation of protein assembly and folding may be one of the functions of GST
(Koonin E.V. et al. Protein Sci. 3:2045-2054 (1994)).

The cytosolic glutathione S-transferases are known to belong to four classes, designated
Alpha, Mu, Pi and Theta. A fifth class of glutathione S-transferases is a microsomal enzyme found
primarily in liver endoplasmic reticulum. An extensive analysis of the expression of microsomal
glutathione transferase 1 in human tissues shows that it predominantly occurs in liver and pancreas.
The relative expression levels in man ranged from the highest, in liver and pancreas, to kidney,
prostate and colon (30-40%), heart, brain, lung, testis, ovary and small intestine (10-20%), and
placenta, skeletal muscle, spleen, thymus and peripheral blood leucocytes (1-10%).
Liver-enriched expression was detected in human
fetal tissues with lung and kidney displaying lower levels (10-20%). No transcripts could be
detected in fetal brain or heart (Estonius M. et al. Eur. J. Biochem. 260:409-13 (1999)). Based on these
observations, and the fact that the enzyme is encoded by a highly conserved single-copy gene, it is
suggested that microsomal glutathione transferase 1 performs essential functions vital to most
mammalian cell types. One particular glutathione S-transferase has also been identified in the mitochondrial
matrix (Pemble S.E. et al. Biochem. J. 319:749-754 (1996)).

GST and GST-like proteins are widely distributed among organisms. In vertebrates and in
cephalopods, some proteins (crystallins) present in the lens are structurally related to alpha-class
GSTs (Chiou S.H. et al. Biochem. J. 309:793-800 (1995)). Furthermore, the olfactory
epithelial cytosol shows the highest GST activity among the extrahepatic tissues. The olfactory
GSTs were found to catalyse glutathione conjugation of several odorant classes, including many
unsaturated aldehydes and ketones, as well as epoxides, and were proposed to play an important role
in chemoreception (Ben-Arie N. et al. Biochem. J. 292:379-384 (1993)).

Cells of higher organisms each contain a family of many GST isozymes in each class with broad, yet
overlapping, specificities. Mu-class GSTs are thought to be involved in the detoxification of reactive
oxygen species (cyclised o-quinones) produced via oxidative metabolism of catecholamines. These
toxins are thought to be involved in neurological disorders of the nigrostriatal and mesolimbic
systems (Parkinson's disease and schizophrenia, respectively). Enzymes of the mu-class GSTs are expressed
in the substantia nigra and have preferential substrate specificity for the cyclised o-quinones formed
by catecholamine metabolism (Hansson L.O. et al. J. Mol. Biol. 287:265-276 (1999); Takahashi Y. et
al. J. Biol. Chem. 268:8893-8 (1993)). Whilst most of the GSTs share common substrates, there are
distinct differences in substrate preference between subfamilies. These enzymes have evolved as a
cellular protection system against a wide variety of electrophilic compounds, including a range of
xenobiotics and oxidative metabolism by-products (oxidized lipid, DNA and catechols), and in
particular are known to metabolise a number of environmental carcinogens.

GSTs are also known to catalyze other reactions, such as peroxidase and isomerase
reactions (Edwards R. et al. Trends in Plant Sci. 5:193-198 (2000)) as well as the addition of
aliphatic epoxides and arene oxides to glutathione; the reduction of polyol nitrate by glutathione to
polyol and nitrite; certain isomerization reactions and disulfide interchange. In addition, there are
marked species differences in catalytic activities between various purified mammalian hepatic GST
mixtures. Some of them catalyse the stereospecific conversion of several pharmacological
substances much less effectively than others. For example, recombinant human GST was
successfully used in the steric conversion of 13-cis-retinoic acid to all-trans-retinoic acid
(Chen H. and Juchau M.R. Biochem. J. 336:223-226 (1998)).

An increasing number of GST genes are being recognized as polymorphic. Certain alleles,
particularly those that confer impaired catalytic activity, may be associated with increased sensitivity
to toxic compounds. Genetic polymorphisms and differences in GST expression have been
implicated in individual susceptibility to certain types of cancer (for review, see Hayes J.D. and Strange R.C.
Pharmacology 61:154-166 (2000)). For example, GSTM1 deficiency predisposes to head and neck
cancer, especially to cancer of the larynx, which is particularly exposed to tobacco smoke
carcinogens (Gronau S. et al. Laryngorhinootologie 79:341-344 (2000)). Conversely, over-expression
of GSTs is thought to be involved in the phenomenon of multi-drug resistance to cancer
chemotherapy. One of the classes of electrophilic compounds that are substrates for the glutathione S-transferase
enzymes is the group of alkylating agents used in antineoplastic therapy. A common
problem that is observed in modern cancer chemotherapy is the appearance of chemotherapeutic-resistant
tumor cells that, because of the resistivity, no longer respond appropriately to the
antineoplastic agents. This resistance is often observed with many drugs that have no physical or
mechanistic similarities to the original agent. GST isoenzymes have been shown to be involved in
the development of drug resistance to a variety of chemotherapeutic agents such as adriamycin,
vinblastine, actinomycin D and colchicine (Beckett et al., Adv. Clin. Chem. 30:281-380 (1993)). It
has been demonstrated that a resistant population of malignant cells shows a modified pattern of
total glutathione S-transferase activity. Selection of MCF-7 breast cancer cells in adriamycin
by Batist et al. (J. Biol. Chem. 261:15544-15549 (1986))
resulted in a subset of cells which were approximately 200 fold more resistant than the parental
cells. The resistant cells were found to exhibit a 45 fold increase in total glutathione S-transferase
activity, the increase being due to the appearance of an isozyme not expressed in the
parental cell line. It was demonstrated that an increase in glutathione S-transferase alone, an
increase conditioned by the transformation of susceptible cells with a foreign DNA construct
expressing the wild-type glutathione S-transferase coding region, could increase the resistance of
cells to an antineoplastic agent. As reported in Puchalski and Fahl, Proc. Natl. Acad. Sci. USA
87:2443-2447 (1990), expression of the rat 1-1, 3-3 and the human P1-1 isozymes of glutathione S-transferase
in COS cells increased their resistance to the agent. A recent study of the increase in
resistance of tumor cells to cytotoxic drugs or ionizing radiation has allowed the identification, using
differential display, of a new GST-related protein, p28, expressed exclusively in lymphoma cells (Kodym
R. et al. J. Biol. Chem. 274:5131-5137 (1999)). Subcellular protein fractionation revealed p28
localization in the cytoplasm, but with thermal stress p28 relocated to the nuclear fraction of cellular
proteins. The sequence homology and the similar functional characteristics of p28 to other GST
family members (in particular relocalization in response to thermal stress and ability to bind
glutathione) argue that p28 is a new mammalian member of the GST superfamily.

Evidence suggests that the level of expression of GSTs is a crucial factor in determining the
sensitivity of cells to a broad spectrum of toxic chemicals. In humans, marked interindividual
differences exist in the expression of class alpha, mu and theta GST. For the most abundant
mammalian classes of GST the mechanisms of transcriptional and post-translational regulation have
been studied. The biological control of the alpha-, mu- and pi-classes exhibits sex-, age-, tissue-,
species-, and tumor-specific patterns of expression. In addition, GSTs are regulated by a structurally
diverse range of xenobiotics and, to date, more than 100 chemicals have been identified that induce
GST (Hayes J.D. and Pulford D.J. Crit. Rev. Biochem. Mol. Biol. 30:445-600 (1995)). A significant
number of these chemicals occur naturally and, as they are found as nonnutrient components in
vegetables and citrus fruits, it is apparent that humans are likely to be exposed regularly to such
compounds. Many inducers effect transcriptional activation of GST genes through either the
antioxidant-responsive element (ARE), the xenobiotic-responsive element (XRE), the GST P enhancer 1
(GPE), or the glucocorticoid-responsive element (GRE). Many of the compounds that induce GST are
themselves substrates for these enzymes, or are metabolized (by cytochrome P-450
monooxygenases) to compounds that can serve as GST substrates, suggesting that GST induction
represents part of an adaptive response mechanism to chemical stress by electrophiles. It also appears
probable that GSTs are regulated in vivo by reactive oxygen species, since the potent inducers are capable of
generating free radicals by redox-cycling; such regulation can be an adaptive response to oxidative
stress in the cell. It has been shown that GST-pi can potently and selectively inhibit activation of jun
protein by its upstream kinase (JNK); these results suggest that GST-pi can also be a regulator of signal
transduction (Monaco R. et al. J. Prot. Chem. 18:859-866 (1999)). The majority of human tumors
express significant amounts of class pi GST (Hayes and Pulford, supra).

Therefore, GSTs have medical importance due to their role in mediating drug resistance in
cancer patients. The measurement of GST isoenzymes in vitro has importance in diagnostic
medicine. For example, the measurement of the pi isoenzyme of GST in tissue specimens is useful
in pathology for the detection and diagnosis of a variety of different tumors. In addition,
measurement of the alpha form of GST in blood is useful for the detection and monitoring of a
variety of different forms of liver disease (for a detailed description of the clinical applications of
GST measurements see Beckett et al., supra).

It is believed that the proteins of SEQ ID NOs: 395 and 403 or part thereof are transferases,
probably transferring alkyl or acyl groups other than methyl groups, more probably glutathione
S-transferases and, as such, play a role in cellular detoxification especially against xenobiotics and
oxidative metabolism byproducts. Preferred polypeptides of the invention are polypeptides
comprising the amino acids of SEQ ID NOs: 395 and 403 from positions 47 to 122, and 260 to 309.
Other preferred polypeptides of the invention are fragments of SEQ ID NOs: 395 and 403 having
any of the biological activities described herein. The transferase activity of the proteins of the
invention or part thereof may be assayed using any of the assays known to those skilled in the art
including those described for GST proteins as in US patents 5,866,792 and 6,096,504, which
disclosures are hereby incorporated by reference in their entireties.
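
Purely as an illustration of how a transferase activity might be computed from a spectrophotometric measurement, the following sketch assumes the widely used 1-chloro-2,4-dinitrobenzene (CDNB) conjugation assay, in which the glutathione conjugate absorbs at 340 nm with an extinction coefficient of 9.6 mM^-1 cm^-1. This assay is not named in the present description and is an assumption of this example, as are the function name, the parameters and the numeric values; the assays of the cited patents may differ.

```python
# Extinction coefficient of the glutathione-CDNB conjugate at 340 nm,
# in mM^-1 cm^-1 (assumed standard value for the CDNB assay).
EPSILON_340_MM = 9.6

def gst_activity_units_per_ml(delta_a_per_min, blank_delta_a_per_min,
                              total_volume_ml, enzyme_volume_ml,
                              path_cm=1.0, dilution=1.0):
    """Return activity in micromoles of conjugate formed per minute
    per mL of enzyme preparation.

    The non-enzymatic (blank) rate is subtracted first; the net rate
    is then converted from absorbance units to concentration with
    the Beer-Lambert law.
    """
    net_rate = delta_a_per_min - blank_delta_a_per_min
    # mM per minute equals micromoles per mL of reaction per minute.
    rate_mm_per_min = net_rate / (EPSILON_340_MM * path_cm)
    return rate_mm_per_min * total_volume_ml * dilution / enzyme_volume_ml

# Hypothetical reading: 0.106 A/min in the sample, 0.010 A/min in the
# blank, 1.0 mL reaction volume containing 0.01 mL of enzyme.
print(gst_activity_units_per_ml(0.106, 0.010, 1.0, 0.01))
```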

To find substrates, the proteins of the invention, or part thereof, or derivative thereof, may
be used for screening libraries of compounds in any of a variety of drug screening techniques. The
fragment employed in such screening may be free in solution, affixed to a solid support, borne on a
cell surface, or located intracellularly. The formation of binding complexes, between the proteins of
the invention, or part thereof, or derivative thereof, and the agent being tested, may be measured.
Antagonists or inhibitors of the proteins of the invention may be produced using methods which are
generally known in the art, including the screening of libraries of pharmaceutical agents to identify
those which specifically bind the protein of the invention. Another technique for drug screening
which may be used provides for high throughput screening of compounds having suitable binding
affinity to the proteins of the invention as described in published PCT application WO84/03564.

The invention relates to methods and compositions using the proteins of the invention or
part thereof or derivative thereof to catalyze GST-dependent detoxification reactions in vitro or in
vivo using any methods known to those skilled in the art. For example, the proteins of the
invention or part thereof may be used to treat toxic byproducts, such as those obtained in
laboratory experiments, or dietary toxins due to the use of pesticides on plants used to feed
animals or humans. Preferably, the proteins of the invention or part thereof or derivative
thereof are added to a sample containing the substrate(s) in conditions allowing detoxification, and
allowed to catalyze the detoxification of the substrate(s). In a preferred embodiment, the
detoxification is carried out using a standard assay such as those described herein.

In some of the above cited embodiments, compositions comprising the proteins of the
present invention or part thereof are added to samples as a "cocktail" with other detoxifying
enzymes. The advantage of using a cocktail of detoxifying enzymes is that one is able to detoxify a
wide range of substrates without knowing the specificity of any of the enzymes. Using a cocktail of
detoxifying enzymes also protects a sample from a wide range of future unknown toxic compounds
from a vast number of sources. For example, the proteins of the invention or part thereof are added
to samples where toxic compounds are undesirable. Alternatively, the protein of the invention or
part thereof may be bound to a chromatographic support, either alone or in combination with other
detoxifying enzymes, using techniques well known in the art, to form an affinity chromatography
column. A sample containing the undesirable substrate is run through the column to remove the
substrate. Immobilizing the proteins of the invention or part thereof on a support is particularly
advantageous for those embodiments in which the method is to be practiced on a commercial scale.
This immobilization facilitates the removal of the enzyme from the batch of product and subsequent
reuse of the enzyme. Immobilization of the protein of the invention or part thereof can be
accomplished, for example, by inserting a cellulose-binding domain in the protein. One of skill in
the art will understand that other methods of immobilization could also be used and are described in
the available literature. Alternatively, the same methods may be used to identify new substrates.

In a preferred embodiment, the invention relates to cells and plants or animals genetically
engineered to express the protein of the invention or part thereof, preferably at a high level, using
any method known to those skilled in the art. Such engineered cells, animals or plants will display
enhanced detoxification of compounds. In a more preferred embodiment, expression of the proteins
of the invention or part thereof will confer herbicide resistance on transgenic plants using
techniques similar to those described in US patent 5,866,792.

For such embodiments, the proteins of the invention may need to be modified to enhance
their ability to react with specific substrates. These modifications can provide novel isoforms which
are specifically efficient against selected electrophilic or alkylating agents. Artificial DNA
constructs encoding and expressing such modified or mutant proteins of the invention or part
thereof may be selectively delivered into targeted cells to enhance the resistivity of those cells to the
alkylating or antineoplastic agents. The methods related to such modifications have been described (Fahl et
al., United States Patent 6,136,605, Oct. 24, 2000). The method is based on random mutation and
selection, with the selection being performed with the agent against which enhanced activity is
sought. The mutation is preferably site directed to the amino acids associated with the H-site on the
enzyme, so as to favor the creation of new, useful isoforms of the enzyme.

In another embodiment, the invention relates to compositions and methods using the
proteins of the invention or part thereof to design specific systems of artificial chemoreception as
described in Ben-Arie N. et al., supra. Such chemoreception systems could recognize odorants,
xenobiotics, pesticides, drugs and may be useful for chemical, cosmetic, pharmaceutical, forensic
and any other analytical purposes. The design of such a system may be generally based on the
subtle specificity of recognition of compounds by different GST isoenzymes. The methods to
produce analytical diagnostics based on enzyme specificity are known by those skilled in the art.

In another embodiment, the invention relates to compositions and methods using the
proteins of the invention or part thereof as ligands for substrates of interest. In a preferred
embodiment, the proteins of the invention or part thereof may be used to identify and/or quantify
substrates using any techniques known to those skilled in the art such as those described in Koonin
E., supra. In another preferred embodiment, the proteins of the invention or part thereof may be
used to improve or to modify some molecular biology methods based on protein-protein
interactions, including but not limited to two-hybrid assays, expression and purification systems
based on GST fusion to heterologous proteins as already available commercially (expression
vectors or plasmids encoding fusion proteins and affinity purification methods).

In still another embodiment, the invention relates to methods and compositions using the
proteins of the invention or part thereof as a marker protein to selectively identify tissues, preferably
fetal brain for the protein of SEQ ID NO: 403, and preferably fetal brain and ovary for the protein
of SEQ ID NO: 395. For example, the protein of the invention or part thereof may be used to synthesize
specific antibodies using any techniques known to those skilled in the art including those described
herein. Such tissue-specific antibodies may then be used to identify tissues of unknown origin, for
example, forensic samples, differentiated tumor tissue that has metastasized to foreign bodily sites,
or to differentiate different tissue types in a tissue cross-section using immunochemistry.

In another embodiment of the invention, measurement of the activity or expression of the
proteins of the invention may be used for the assessment of organ status, including organ damage
following immunological or toxicological insult and diagnosis of transplant rejection, using any
technique known to those skilled in the art including those described in US patents 6,080,551 and
RE35,419.

In another embodiment of the invention, the proteins of the invention or part thereof, or
derivative thereof, may be used to diagnose, treat and/or prevent cell proliferative disorders linked
to dysregulation of gene expression of the proteins of the invention. Such disorders include, but are
not limited to, benign tumors, and cancers such as adenocarcinoma; leukemia; melanoma;
lymphoma; sarcoma; and cancers of the brain, ovary, bladder, colon, liver, small intestine, large
intestine, breast, kidney, lung, and prostate. Diagnosis may be performed using nucleic acids or
antibodies able to detect the expression of the protein of the invention using any technique known to
those skilled in the art including Northern blotting, RT-PCR, immunoblotting methods,
immunohistochemistry, and enzyme-linked immunosorbent assay (ELISA) described herein. Quantities
of the protein of the invention expressed in subject samples (control and disease) from biopsied
tissues, body fluids or cell extracts taken from patients are compared with standard values.
Deviation between standard and subject values establishes the parameters for diagnosing disease.

For prevention and/or treatment purposes, the expression of the proteins of the invention
may be enhanced using any methods known to those skilled in the art. For example, gene therapy
techniques may be used such as the delivery of sense promoter polynucleotide constructs for the
proteins of the invention or part thereof using a recombinant expression vector such as a chimeric
virus or a colloidal dispersion system (see Nelson et al. United States Patent 552,277).
Alternatively, the proteins of the invention or fragments thereof or derivatives thereof may be
administered to a subject to treat or prevent cancerous and precancerous disorders as well as
proliferative disorders in general. Such disorders can include, but are not limited to, syndromes
represented by abnormal neoplastic, including dysplastic, changes of tissue, dysplastic growths in
ovary, brain, colonic, breast, prostate or lung tissues, dysplastic nevus syndromes, polyposis
syndromes, colonic polyps, precancerous lesions of the cervix (i.e., cervical dysplasia), esophagus,
lung, prostatic dysplasia, prostatic intraneoplasia, breast and/or skin and related conditions (e.g.,
actinic keratosis), whether the lesions are clinically identifiable or not.

The invention also relates to compositions and methods using the proteins of the invention
or part thereof or derivative thereof to decrease drug resistance in cancer chemotherapy. Inhibition
of the expression and/or activity of the proteins of the invention may be achieved using any means
known to those skilled in the art. In a preferred embodiment, gene therapy methods such as
antisense oligonucleotide or triple helix strategies, described elsewhere in the application, are used. In
another preferred embodiment, antagonists of the activity of the proteins of the invention may be
used. These antagonists may be directly administered to patients. Low-molecular-weight inhibitors
(i.e., those which can be delivered freely into the brain and which specifically inhibit GST activity)
are especially preferred for use in cancer therapy. Alternatively, artificial DNA constructs encoding
peptide modulators or inhibitors of the activity of the protein of the invention and flanking
sequences effective to express the protein coding sequence in a host cell as well as flanking
regulatory sequences (such as an antioxidant responsive element which enhances the expression of
the glutathione S-transferase in the presence of antioxidant molecules) may be used. Such artificial
DNA constructs confer on recombinant cells an increased level of resistance to an antineoplastic
agent.

There is a need for selective inhibitors of GST isoenzymes for treatment of drug resistance
in cancer patients. Thus, in a further embodiment of this invention, the proteins of the invention or
fragments thereof can be used for screening of compounds which are selective inhibitors or
specific inhibitors of one or more GST isoenzymes. Selective inhibition means that a compound has
a greater inhibitory effect on one isoenzyme than it does on another GST isoenzyme. Such
compounds could also be tested and selected for their ability to overcome drug resistance to
chemotherapeutic agents (see Jones et al., United States Patent 6,103,665, Aug. 2000). For example,
mammalian cell lines that have been made resistant to particular chemotherapeutic drugs can be
used to identify haloenol lactone compounds that render the lines sensitive to the chemotherapeutic
agents. Such cell lines are known to those of skill in the art and can be obtained for example from
the American Type Culture Collection, Rockville, Md., USA.
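
The definition of selective inhibition given above can be illustrated with a short Python sketch. It assumes, purely for illustration, that inhibitory effect is summarized as an IC50 value per isoenzyme (lower meaning more potent); the function names, the isoenzyme labels and the numbers are hypothetical and do not come from the present disclosure.

```python
def selectivity_ratio(ic50_other, ic50_target):
    """Fold selectivity for the target isoenzyme (values > 1 favor the target).

    Assumption: potency is expressed as IC50, so a lower value means a
    greater inhibitory effect.
    """
    if ic50_other <= 0 or ic50_target <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_other / ic50_target


def is_selective(ic50_by_isoenzyme, target, min_fold=1.0):
    """True if the compound inhibits `target` more potently than every
    other isoenzyme in the panel by more than `min_fold`."""
    target_ic50 = ic50_by_isoenzyme[target]
    return all(
        selectivity_ratio(ic50, target_ic50) > min_fold
        for name, ic50 in ic50_by_isoenzyme.items()
        if name != target
    )


# Hypothetical panel: IC50 values in micromolar for three isoenzymes.
panel = {"isoenzyme_A": 5.0, "isoenzyme_B": 50.0, "isoenzyme_C": 20.0}
print(is_selective(panel, "isoenzyme_A"))  # compound favors isoenzyme_A
```

A stricter screen could raise `min_fold` so that only compounds with a large margin over every other isoenzyme are retained.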

Protein of SEQ ID NO: 306, GRP (187-38-0-0-110-CS)

The protein of SEQ ID No: 306, herein referred to as GRP, encoded by the cDNA of SEQ ID
No: 65, herein referred to as GRP2, is homologous to bovine glutamic-acid rich protein (GARP)
(GENEPEPT ID: M61185). The protein of the invention is overexpressed in the brain and fetal
brain, lymph ganglia and thyroid.

The protein of the invention exhibits homology with bovine glutamic-acid rich protein
(GARP) (18% identical amino acids, 28% positive amino acids when aligned by BLASTP 2.0.9).
GARP proteins have been identified as multivalent proteins that interact with the key players of
cGMP signaling, phosphodiesterase and guanylate cyclase. GARP proteins are closely associated
with cyclic nucleotide-gated channels (CNGs), which make up a family of nonselective cation channels
found in a variety of tissues. The beta subunit of CNGs has a unique bipartite structure, containing
a membrane-spanning region (beta part) and a GARP part (GARP). GARP is highly homologous to
a soluble splice form, GARP1, and a splice variant lacking the C-terminal glutamic-acid-rich
region. Experiments using GARP attached to affinity columns showed that phosphodiesterases are
highly retained by the column [Korschen HG et al., Nature, 400(6746):761-766 (1999)]. Moreover,
Korschen et al. demonstrated that GARP inhibits both soluble and membrane-bound
phosphodiesterase.

Cyclic nucleotides are involved in regulating the activity of airway smooth muscle and
many other cells in the airways, including pro-inflammatory, immunocompetent cells such as
macrophages, eosinophils, mast cells and lymphocytes. Cyclic nucleotides are inactivated by the
action of cyclic nucleotide phosphodiesterase enzymes (PDE). Inhibition of cGMP PDE results in
elevation of cGMP levels; elevated cGMP levels are associated with beneficial anti-platelet,
anti-neutrophil, anti-vasospastic and vasodilatory activity.

Thus, the subject invention provides a polypeptide having the sequence of SEQ ID No: 306
or a GRP polypeptide encoded by the human cDNA of clone 187-38-0-0-110. In a preferred
embodiment, GRP is encoded by the sequence of SEQ ID No: 65 or the human cDNA of clone
187-38-0-0-110; however, all polynucleotides encoding the polypeptides of the invention are included.
As used herein, "the GRP protein" includes the full length protein of SEQ ID NO: 306 as well as
biologically active fragments of the GRP protein. Also encompassed by the phrase "the GRP
protein" are variants of the protein of SEQ ID NO: 306 and biologically active fragments of said
variant proteins.

"Biologically active fragments" are defined as those peptide or polypeptide fragments of
GRP which have at least one of the biological functions of the full length protein (e.g., the ability to
inhibit the activity of PDE or serve as an affinity substrate for PDEs).

The invention also provides variants of the GRP protein of SEQ ID
NO: 306. These variants have at least about 80%, more preferably at least about 90%, and most
preferably at least about 95% amino acid sequence identity to the amino acid sequence of GRP.
Variants according to the subject invention also have at least one functional or structural
characteristic of GRP, such as the ability to inhibit the activity of PDE or serve as an affinity
substrate for PDEs. The invention also provides biologically active fragments of the variant
proteins. Unless otherwise indicated, the methods disclosed herein can be practiced utilizing GRP
or variants thereof. Likewise, the methods of the subject invention can be practiced using
biologically active fragments of GRP, or biologically active fragments of GRP variants.
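
The identity thresholds recited above (about 80%, 90% and 95%) can be checked on a pairwise alignment with a sketch such as the following. It is a minimal Python illustration that assumes the two sequences have already been aligned to equal length with "-" as the gap character, and that identity is counted over ungapped aligned positions only; other conventions for the denominator exist, and the example sequences are hypothetical rather than taken from SEQ ID NO: 306.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a, aligned_b, threshold=80.0):
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 8-residue alignment with a single substitution.
print(percent_identity("MKTAYIAK", "MKTAYLAK"))
```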
Assays related to the inhibitory effect of the protein of the invention can be carried out
using techniques described in U.S. Patent No. 6,130,333, hereby incorporated by reference in its
entirety; or by any other technique known to those skilled in the art.

One aspect of the subject invention provides compositions and methods of using the
nucleotide sequence of SEQ ID NO: 65, or its complement, in molecular biology techniques. These
techniques include, but are not limited to: the use of segments of GRP2 as oligomers for PCR;
expression of GRP2 and the production of recombinant proteins; the generation of antisense
RNA and DNA, their chemical analogs and the like; and the use of GRP2 segments as hybridization
probes and in chromosome gene mapping.
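
Several of the techniques listed above (PCR oligomers, antisense molecules, hybridization probes) rely on the complement of a nucleotide sequence. The following Python sketch shows one way to derive a reverse complement; the input string is a hypothetical six-base example and is not a segment of SEQ ID NO: 65.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    unexpected = set(seq) - set("ACGTacgt")
    if unexpected:
        raise ValueError("unexpected characters: %s" % sorted(unexpected))
    # Complement every base, then reverse so the result reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical segment; the result is the antisense strand read 5' to 3'.
print(reverse_complement("ATGGCC"))
```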

For example, the nucleotide sequence of SEQ ID No: 65, or its complement, can be used to
generate hybridization probes for mapping the naturally occurring genomic sequence. The
sequence can be mapped to a particular chromosome or to a specific region of the chromosome
using well known techniques. These include in situ hybridization to chromosomal spreads,
flow-sorted chromosomal preparations, or artificial chromosome constructions such as yeast artificial
chromosomes, bacterial artificial chromosomes, bacterial P1 constructions, or single chromosome
cDNA libraries as reviewed in Price (Price CM - Blood Rev. - 1993, 7(2): 127-34) and Trask
(Trask BJ - Trends Genet. - 1991, 7(5): 149-54).

In situ hybridization of chromosomal preparations and physical mapping techniques, such
as linkage analysis using established chromosomal markers, are invaluable in extending genetic
maps; genetic maps provide valuable information to investigators searching for disease-causing
genes using positional cloning or other gene discovery techniques. The nucleotide sequence of the
present invention can also be used to detect differences in the chromosomal location due to
translocation, inversion, etc. among normal, carrier or affected individuals.

Another embodiment of the subject invention provides pharmaceutical compositions
comprising the GRP protein and pharmaceutically acceptable carriers. These pharmaceutical
compositions can be used in prophylaxis and/or treatment of a variety of conditions where
inhibition of phosphodiesterase is considered to be beneficial. The biochemical, physiological, and
clinical effects of phosphodiesterase inhibitors suggest their utility in a variety of disease states in
which modulation of smooth muscle, renal, hemostatic, inflammatory, and/or endocrine function is
desirable. Therefore, the GRP protein can be used for the treatment or prophylaxis of a number of
disorders and conditions including, but not limited to, stable, unstable, and variant (Prinzmetal)
angina; hypertension; pulmonary hypertension; congestive heart failure; acute respiratory distress
syndrome; acute and chronic renal failure; atherosclerosis; conditions of reduced blood vessel
patency (e.g., postpercutaneous transluminal coronary or carotid angioplasty, or post-bypass surgery
graft stenosis); peripheral vascular disease; vascular disorders, such as Raynaud's disease,
thrombocythemia, intermittent claudication; immune diseases, multiple sclerosis; cancers;
inflammatory diseases, graft versus host disease, Alzheimer's disease, memory deficits, stroke,
bronchitis, chronic asthma, acute lung injury, chronic obstructive pulmonary disease, allergic
asthma, allergic rhinitis; glaucoma; osteoporosis; preterm labor; benign prostatic hypertrophy; male
and female erectile dysfunction; and diseases characterized by disorders of gut motility (e.g.,
irritable bowel syndrome).

The GRP protein of the invention can also provide beneficial anti-platelet, anti-neutrophil,
anti-vasospastic, vasodilatory, natriuretic, and diuretic activities when administered in
therapeutically effective amounts. The GRP protein can also potentiate the effects of
endothelium-derived relaxing factor (EDRF), gastric NO administration, nitrovasodilators, atrial natriuretic factor
(ANF), brain natriuretic peptide (BNP), C-type natriuretic peptide (CNP), and
endothelium-dependent relaxing agents such as bradykinin and acetylcholine, when administered to an
individual.

Another embodiment of the subject invention provides methods of treating male erectile dysfunction
comprising the administration of therapeutically effective amounts of the GRP protein using 
appropriate methods known to the skilled artisan. 

Another embodiment of the subject invention provides industrially significant methods of
recovering PDE comprising contacting solutions containing PDE with immobilized GRP protein.
In this aspect of the invention, the GRP protein is immobilized onto a solid support and allowed to
specifically bind to PDE contained in a solution or sample. PDE can then be eluted from the
immobilized GRP protein according to methods known to the skilled artisan (see, for example,
Korschen et al., supra). PDE is a commercially valuable commodity sold by various vendors.

Protein of SEQ ID NO: 302 (internal designation 187-2-2-0-A3-CS)

The protein of SEQ ID NO: 302, encoded by the cDNA of SEQ ID No: 61, is related to a
neuronally expressed protein (neuritin, Genseq accession number W37859) known to have a role in
neurogenesis and axonal and dendritic growth.

The 164 amino acid protein of SEQ ID NO: 302 is 24% identical to neuritin over the
complete sequence. Specifically, SEQ ID NO: 302 displays two blocks of strong homology to
neuritin (amino acids 41-60 of SEQ ID NO: 302 display 55% identity and 95% similarity to amino
acids 30-49 of neuritin, and amino acids 66-117 of SEQ ID NO: 302 display 32% identity and 57%
similarity to amino acids 62-113 of neuritin). The C-terminal portion of neuritin (aa 116-142 of
neuritin) is highly hydrophobic and contains a cleavage site found in GPI-anchored proteins. The
protein of SEQ ID NO: 302 also has a hydrophobic C-terminus (21 out of the last 30 amino acids
are hydrophobic) and conforms to the GPI anchor consensus sequence.
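
A count of the kind quoted above (hydrophobic residues among the last 30 amino acids) can be reproduced with a short Python sketch. The set of residues treated as hydrophobic is an assumption made for illustration, since the text does not state which classification was used, and the test sequence is synthetic rather than SEQ ID NO: 302.

```python
# Assumption: residues counted as hydrophobic (one-letter codes).
# The source does not specify its classification; adjust as needed.
HYDROPHOBIC = set("AVLIMFWC")


def c_terminal_hydrophobic_count(protein, window=30):
    """Count hydrophobic residues in the last `window` residues of a protein."""
    if window <= 0:
        raise ValueError("window must be positive")
    tail = protein.upper()[-window:]
    return sum(1 for residue in tail if residue in HYDROPHOBIC)


# Synthetic 40-residue sequence: 10 D, then 21 L, then 9 K.
synthetic = "D" * 10 + "L" * 21 + "K" * 9
count = c_terminal_hydrophobic_count(synthetic)
print(count, "of the last 30 residues are hydrophobic")
```

A hydrophobicity scale with a sliding window would give a finer profile; the simple count is shown because it matches the form of the statement in the text.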

Neuritin, also known as candidate plasticity-related gene number 15 (cpg-15), was
independently identified by two groups from differential cDNA libraries generated from kainic
acid-treated hippocampal cells (Nedivi et al., Nature. 363:718-22 (1993); Naeve et al., Proc. Natl.
Acad. Sci. USA. 94:2648-53 (1997)). Neuritin is a secreted protein that contains a potential GPI
anchoring domain believed to anchor the protein to the membranes of target cells. Neuritin is
expressed strongly in the brain, and in particular, in systems with pronounced developmental
plasticity, including the pyramidal neurons of the cornu ammonis and the granule cells of the
hippocampus dentate gyrus. Strong expression is also observed in layers of tenia tecta projecting to
the olfactory bulb, the major target of the retinal ganglion cells, and the optic nerve layer of the
superior colliculus (optic tectum); and localized expression is observed in the thalamic nuclei and
the cerebral cortex (Nedivi et al., Proc. Natl. Acad. Sci. USA. 93:2048-53 (1996); Naeve et al., Proc.
Natl. Acad. Sci. USA. 94:2648-53 (1997); Nedivi et al., Science. 281:1863-66 (1998)). Neuritin mRNA is
expressed throughout development and persists into adulthood. In addition, neuritin expression is
upregulated in adults by brain derived neurotrophic factor (BDNF). Neuritin mRNA is also
detected in the lung and the liver, although at lower levels than those observed in the CNS (Naeve et
al., Proc. Natl. Acad. Sci. USA. 94:2648-53 (1997)).

Functional studies on the neuritin protein have revealed a role in neuronal growth. In one
such study, rat cortical and hippocampal neurons were treated with recombinant forms of neuritin.
Neurons treated with neuritin showed extensive neuritogenesis over control cultures. Specifically,
neurons showed well-differentiated cell bodies with well-defined extensions after treatment with
neuritin (Naeve et al., Proc. Natl. Acad. Sci. USA. 94:2648-53 (1997)). Other studies using frog
optic tectum showed that transfection of tectum cells with neuritin cDNA can increase the growth
rate of tectal cell dendrites (Nedivi et al., Science. 281:1863-66 (1998)). Studies have also shown
that neuritin can modify the growth of retinotectal axons by increasing the elaboration of
presynaptic axons and can promote the maturation of retinal tectal synapses (Cantallops et al.,
Nature Neuroscience. 3:1004-1011 (2000)). Together, these results indicate that neuritin promotes
the growth of pre- and post-synaptic neurons and contributes to the formation and stabilization of
mature synapses.

The subject invention provides the polypeptide of SEQ ID NO: 302 and polynucleotide
sequences encoding the amino acid sequence of SEQ ID NO: 302. In one embodiment, the
polypeptides of SEQ ID NO: 302, including fragments, variants, etc. are replaced by the
corresponding polypeptide encoded by the human cDNA of clone 187-2-2-0-A3-CS. Also included
in the invention are biologically active fragments of the protein of SEQ ID NO: 302 and
polynucleotide sequences encoding these biologically active fragments. In another embodiment,
biologically active fragments comprise amino acid positions 41-60, 66-117, 41-117, and 41-164. In
another embodiment, these fragments may be joined together by chemical linkers or by
recombinantly inserted amino acid linker segments according to methods known in the art.
"Biologically active fragments" are defined as those peptide or polypeptide fragments of SEQ ID
NO: 302 which have at least one of the biological functions of the full length protein (e.g., the
ability to stimulate neurogenesis and axonal and dendritic growth).

The invention also provides variants of SEQ ID NO: 302. These variants have at least
about 80%, more preferably at least about 90%, and most preferably at least about 95% amino acid
sequence identity to the amino acid sequence of SEQ ID NO: 302. Variants according to the
subject invention also have at least one functional or structural characteristic of SEQ ID NO: 302,
such as the biological functions described above. The invention also provides biologically active
fragments of the variant proteins. Unless otherwise indicated, the methods disclosed herein can be
practiced utilizing the polypeptide of SEQ ID NO: 302 or variants thereof. Likewise, the methods
of the subject invention can be practiced using biologically active fragments of the protein of SEQ ID NO:
302 or variants of said biologically active fragments.

Because of the redundancy of the genetic code, a variety of different DNA sequences can
encode SEQ ID NO: 302. It is well within the skill of a person trained in the art to create these
alternative DNA sequences, which encode proteins having the same, or essentially the same, amino
acid sequence. These variant DNA sequences are, thus, within the scope of the subject invention.
As used herein, reference to "essentially the same sequence" refers to sequences that have amino
acid substitutions, deletions, additions, or insertions that do not materially affect biological activity.
Fragments retaining one or more characteristic biological activities of SEQ ID NO: 302 are also
included in this definition.

"Recombinant nucleotide variants" are alternate polynucleotides which encode a particular
protein. They can be synthesized, for example, by making use of the "redundancy" in the genetic
code. Various codon substitutions, such as the silent changes which produce specific restriction
sites or codon usage-specific mutations, can be introduced to optimize cloning into a plasmid or
viral vector or expression in a particular prokaryotic or eukaryotic host system, respectively.
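
The redundancy of the genetic code referred to above can be made concrete with a small Python sketch that enumerates every DNA sequence encoding a short peptide. Only a partial codon table from the standard genetic code is included, and the peptide "MKF" is a hypothetical example, not a fragment of SEQ ID NO: 302.

```python
from itertools import product

# Partial standard genetic code (DNA codons) for the residues used below.
CODONS = {
    "M": ["ATG"],
    "W": ["TGG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "D": ["GAT", "GAC"],
}


def synonymous_sequences(peptide):
    """List every DNA sequence that encodes `peptide` under CODONS."""
    choices = [CODONS[residue] for residue in peptide]
    return ["".join(codons) for codons in product(*choices)]


def translate(dna):
    """Translate a DNA sequence back to a peptide using the same table."""
    lookup = {codon: aa for aa, codons in CODONS.items() for codon in codons}
    return "".join(lookup[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical tripeptide: 1 x 2 x 2 = 4 distinct coding sequences.
variants = synonymous_sequences("MKF")
for dna in variants:
    print(dna, "->", translate(dna))
```

Choosing among such synonymous sequences is what allows a silent change to introduce a restriction site or to match the codon usage of a given host.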

The protein of SEQ ID NO: 302, and variants thereof, can be used to produce antibodies
according to methods well known in the art. The antibodies can be monoclonal or polyclonal.
Antibodies can also be synthesized against immunogenic fragments of SEQ ID NO: 302, as well as
variants thereof, according to known methods. The subject invention also provides antibodies
which specifically bind to biologically active fragments of SEQ ID NO: 302 or biologically active
fragments of SEQ ID NO: 302 variants.

The protein of SEQ ID NO: 302 can be utilized to treat diseases and disorders of the central
or peripheral nervous system which arise from alterations in the pattern of expression of the protein
of SEQ ID NO: 302. In this aspect of the subject invention, compositions comprising the protein of
SEQ ID NO: 302 and a pharmaceutical carrier are administered to an individual in need thereof.
Alternatively, in cases where the protein of SEQ ID NO: 302 is overexpressed, reductions in SEQ
ID NO: 302 levels may be accomplished by a variety of methods known to those of skill in the art.
These methods include the introduction of neutralizing antibodies or the use of antisense
polynucleotides derived from polynucleotides encoding the protein of SEQ ID NO: 302 or from
clone 187-2-2-0-A3-CS.

The subject invention also provides materials and methods for the treatment of neurological
disorders comprising contacting neuronal cells with compositions comprising the protein of SEQ ID
NO: 302 and pharmaceutically acceptable carriers. Thus, this aspect of the invention provides
methods of treating patients suffering from a variety of neurological disorders, conditions, and/or
diseases of the central, autonomic, or peripheral nervous system. These include neurological
damage arising from congenital disease, trauma, surgery, stroke, ischemia, infection, metabolic
disease, nutritional deficiency, malignancy, and/or exposure to toxic agents. Additional examples
of such disorders include, but are not limited to, epilepsy, cerebral neutralisms, Alzheimer's disease,
Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal
disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural
muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other
demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural
abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous
system disease, prion diseases, Creutzfeldt-Jakob disease, Gerstmann-Straussler-Scheinker
syndrome, fatal familial insomnia, diabetes-induced peripheral neuropathy or neuropathy induced
by other metabolic disorders or nutritional deficiencies, neurofibromatosis, tuberous sclerosis,
cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and
other developmental disorders of the central nervous system, cerebral palsy, neuroskeletal disorders,
autonomic nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular
dystrophy and other neuromuscular disorders, dermatomyositis and polymyositis, inherited,
metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders
including mood, anxiety, and schizophrenic disorders, seasonal affective disorder, akathisia,
amnesia, and/or other dystrophies or degenerative disorders of the visual, sensory, olfactory,
auditory, motor, or memory systems. Methods of introducing therapeutic compounds into cells are
known to those skilled in the art. Non-limiting examples include the use of targeted liposomes,
fusogenic liposomes, or other carriers suitable for the introduction of a therapeutic compound into a
target cell.

The subject invention also provides materials and methods for the treatment of neurological
disorders comprising contacting neuronal cells with compositions comprising polynucleotides
encoding the protein of SEQ ID NO: 302 and pharmaceutically acceptable carriers. In one
embodiment, the polynucleotide is clone 187-2-2-0-A3-CS. Methods of introducing
polynucleotides into cells and directing expression of the polynucleotide are known to those skilled
in the art.

Antibodies raised against the protein of SEQ ID NO: 302 may be used in a variety of
immunoassays known to those skilled in the art. In this aspect of the invention, immunoassay
screening for abnormal levels of the protein of SEQ ID NO: 302 can be used as screens or
diagnostic/prognostic indicators of neurodegenerative disease.


Antibodies raised against the protein of SEQ ID NO: 302, fragments, and/or derivatives
thereof may also be used for detection and identification of growing and differentiating neurons
including, but not limited to, the pyramidal neurons of the cornu ammonis, the granule cells of the
hippocampus dentate gyrus, neurons in layers of tenia tecta projecting to the olfactory bulb, the
optic nerve layers of the superior colliculus (optic tectum), and neurons of the thalamic nuclei and
the cerebral cortex.

Protein of SEQ ID NO:301 (187-12-4-0^8-05^

The protein of SEQ ID No:301, encoded by the cDNA of SEQ ID NO:60, is homologous to
the Eukaryotic cell growth inhibiting factor (GENESEQP: R95950) described in patent
WO 96/17933. The protein of the invention is highly expressed in the brain, fetal brain, fetal liver
and the prostate.

It is believed that the protein of the invention is a cell growth inhibiting factor. Preferred
polypeptides of the invention are those that comprise amino acids 221 to 287. Other preferred
polypeptides of the invention are any fragment of SEQ ID NO:301 having any of the biological
activities described herein. In the present invention, a cell growth inhibiting factor is defined as a peptide
or protein that decreases, suppresses or terminates (reversibly or irreversibly) the growth of at least
one type of cell such as, but not limited to, bacteria, yeast, vertebrate cells, mammalian cells and
human cells, under ordinary culturing conditions known to those skilled in the art. Assay of the
inhibiting activity of the invention can be carried out, for example, by evaluating the decrease in
DNA synthesis as described in Patent WO 96/17933, or by measuring the number or density of cells
using any standard method. For example, fibroblasts are transfected with a vector containing the
DNA sequence coding for the protein of the invention or part thereof. Cells are then cultured in a
standard medium, exposed to tritiated thymidine, and further cultured. The cultures are then fixed
and stained with X-Gal, the blue-stained galactosidase-expressing cells are counted under a
microscope, and the ratio of cells showing dark particles in their nuclei due to tritiated thymidine
uptake is determined. DNA synthesis inhibitory rates are calculated from the labeling index, taking
as reference (i.e. 0% inhibition) a culture of cells transfected with a "blank" vector (i.e. not modified
to contain the DNA coding for the protein of the invention or part thereof).
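
The calculation outlined above can be sketched in Python as follows. The text does not give an explicit formula, so the sketch assumes that the labeling index is the fraction of counted galactosidase-positive cells that show thymidine labeling, and that the inhibitory rate is the relative drop in labeling index compared with the blank-vector culture; the function names and cell counts are hypothetical.

```python
def labeling_index(labeled_cells, counted_cells):
    """Fraction of counted (blue-stained) cells with labeled nuclei."""
    if counted_cells <= 0:
        raise ValueError("counted_cells must be positive")
    if not 0 <= labeled_cells <= counted_cells:
        raise ValueError("labeled_cells must lie between 0 and counted_cells")
    return labeled_cells / counted_cells


def inhibition_percent(index_test, index_blank):
    """DNA synthesis inhibitory rate relative to the blank-vector culture.

    Assumed formula: (1 - LI_test / LI_blank) * 100, so that the blank
    vector itself corresponds to 0% inhibition.
    """
    if index_blank <= 0:
        raise ValueError("blank labeling index must be positive")
    return (1.0 - index_test / index_blank) * 100.0


# Hypothetical counts: 30 of 100 test cells labeled, 60 of 100 blank cells.
li_test = labeling_index(30, 100)
li_blank = labeling_index(60, 100)
print(round(inhibition_percent(li_test, li_blank), 1))
```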

Aging at the cell level is associated with individual aging. The maximum possible number
of divisions (division life span) of cultured cells is inversely proportional to individual age. Even if
an aged cell is fused with a young or immortalized cell, DNA synthesis does not occur again in aged
cells; on the contrary, DNA synthesis in the young and immortalized cell is suppressed (Stein GH et
al. - Proc Nat Acad Sci. - 1981, 78:p3025). This demonstrates that certain factors controlling
cellular senescence are dominant, and that aged cells not only lack substances essential for their
growth but also have substances that actively suppress DNA synthesis. Moreover, microinjections
of mRNA, prepared from an aged cell, are known to inhibit DNA synthesis (Lumpkin CK et al. -
Science - 1986, 232:p393). Therefore, as cells age, there are some genes that are newly expressed
or whose expression is increased. Such genes play an important role, directly or indirectly, in cell
aging.

Pereira-Smith et al. tested the complementation of a large number of immortalized human
cells in fused pairs and demonstrated the presence of 4 groups of human aging genes (Pereira-Smith
OM et al. - Proc Nat Acad Sci. - 1988, 85:p6042). Clarifying the nature of aging-associated genes
is not only important in understanding aging, both at the cellular and individual levels, but is also
significant in that the use of these genes or gene products would enable the diagnosis of various
aging-associated diseases and diseases caused by cellular senescence, the development of
prophylactic/therapeutic drugs for such diseases, and their application as prophylactic/therapeutic
drugs for various diseases involving uncontrolled cell growth such as, but not limited to, cancers.

In one embodiment of the present invention, the polypeptides and polynucleotides of the
invention are used to specifically label cells of the brain, fetal brain, fetal liver and the prostate, as
the protein is strongly expressed in these tissues. The ability to specifically detect these tissues, and
cells derived from these tissues, has a number of uses, including for the determination of the history
of tumor cells and for histological analyses.

An embodiment of the present invention invention relates to methods and compositions of 
using the protein of SEQ ID NO:301 or the cDNA of SEQ ID NO:60 or any part thereof, to inhibit 
cell proliferation in vitro. For example, by including the invention in a "cocktail" with other 

20 proteins (such as proteases) it could be used as a decontaminant, i.e. to prevent the growth of any 
cells to maintain a sterile environment. Preferred applications of this embodiment include 
decontamination of samples (such as cell culture media) and instruments (such as surgical 
instruments), where the invention would be used as a bacteriostatic/mycostatic agent. Another 
example pertains to the use of the protein of the invention as a reagent for terminating the cell cycle
of cultured cells at a given time point, e.g., as a reagent for synchronizing cell division, avoiding the
need to isolate specific cells (e.g. at the desired cell cycle phase) in cultures using techniques such 
as flow cytometry. Synchronization of the cell cycle could, for example, be achieved by
transfecting cells with an appropriate vector containing the DNA coding for the protein of the
invention or part thereof, where expression of the protein results in growth inhibition. Then, after a
certain time, an inhibitor of the protein could be administered in order to enable the cells to resume
growth (e.g. all at the S phase, when DNA is synthesized). Further, the ability to synchronize the 
cell cycle in an in vitro experimental system would provide improved assay precision or would 
facilitate any laboratory procedure or experiment involving a particular cell cycle stage. Use of the 
invention for in vitro inhibition of cell proliferation is not limited to the above examples; the
invention is potentially useful in any in vitro application that requires the inhibition of cellular
proliferation. 

Another embodiment of the invention pertains to the introduction of SEQ ID NO:301 or
SEQ ID NO:60, or any part thereof, into a target tissue (such as skin or vascular endothelium) to 
establish an in vitro aged cell line of the target tissue. Such a cell line is useful not only as a screening
system for clarifying the mechanisms of aging and/or cellular senescence, but also for seeking
prophylactic and therapeutic drugs for aging-associated diseases and/or diseases caused by cellular
senescence, as it provides an in vitro model of aged cells (in vitro cell cultures provide the 
advantage of being easily produced and are not as expensive as animal models). Thus these cells 
are potentially useful in drug candidate screening applications; a preferred application involves their
use in high-throughput screening, e.g. to identify "lead compounds."

A preferred embodiment of the invention relates to the use of the cDNA of SEQ ID NO:60
or part thereof, as a probe for examining individual aging at the gene expression level. Specifically,
SEQ ID NO:60, or part thereof, can be used as a diagnostic reagent for various aging-associated 
diseases such as, but not limited to: arteriosclerosis, osteo-arthritis, dementia (including 
Alzheimer's disease) and Parkinson's disease. 

In a related embodiment, the cDNA of SEQ ID NO:60 or part thereof, could be used to
synthesize antisense oligonucleotides by methods well known to those skilled in the art. Antisense
oligonucleotides can be used to inhibit the synthesis of the protein of SEQ ID NO:301, thereby 
preventing cell and tissue aging and/or promoting the rejuvenation of aged cells and tissues. These 
antisense oligonucleotides can also be used for in vivo or ex vivo treatment and prophylaxis of
diseases caused by cellular senescence or aging-associated diseases such as, but not limited to:
arteriosclerosis, osteo-arthritis, dementia (including Alzheimer's disease) and Parkinson's disease. 
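
As a rough illustration of how candidate antisense oligonucleotides could be derived from a sense-strand cDNA, the sketch below takes the reverse complement of sliding windows along a sequence. The example sequence, the window length of 20 nucleotides, the step of 5 and the function names are all hypothetical choices made for illustration; the sequence shown is a placeholder and is not the cDNA of SEQ ID NO:60.

```python
# Illustrative sketch only: derives candidate antisense oligonucleotides
# as reverse complements of windows along a sense-strand cDNA.
# The sequence, window length and step below are hypothetical placeholders.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligos(sense_cdna: str, length: int = 20, step: int = 5):
    """Yield (start, oligo) pairs, where start is the 0-based position of the
    targeted window on the sense strand and oligo is its reverse complement."""
    for start in range(0, len(sense_cdna) - length + 1, step):
        window = sense_cdna[start:start + length]
        yield start, reverse_complement(window)


# Placeholder 30-nt sequence, NOT taken from the sequence listing.
example_sense = "ATGGCCTTGAGCCTGACCGATGAAGTTCTG"

if __name__ == "__main__":
    for start, oligo in antisense_oligos(example_sense):
        print(start, oligo)
```

In practice, candidate oligonucleotides would still have to be selected and validated experimentally; the sketch only shows the base-pairing relationship between the cDNA and an antisense sequence.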

In a most preferred embodiment, SEQ ID NO:301, SEQ ID NO:60, or any part thereof, can 
be used as a pharmaceutical drug to treat pathologies such as, but not limited to cancers, 
inflammation, or infections. For example, when used as an antibacterial, antiviral and/or antifungal
agent, inhibition of microbial proliferation could be achieved by directly inhibiting either
microorganism growth (in the case of fungal and bacterial infections) or DNA synthesis of infected
cells in the case of viral infections. The DNA of SEQ ID NO:60 or part thereof, can also be used to 
develop gene therapy products in the in vivo or ex vivo treatment of diseases and conditions such as 
cancers and inflammation. SEQ ID NO:301, SEQ ID NO:60, or any part thereof, may also be used:
1) as a probe for the diagnosis, 2) for the prophylaxis or 3) for the treatment of aging-associated diseases
such as, but not limited to: arteriosclerosis, osteo-arthritis, dementia (including Alzheimer's disease) 
and Parkinson's disease. In a related embodiment, the protein of SEQ ID NO:301 or the cDNA of 
SEQ ID NO:60, or any part thereof, can be used in the development of drugs that will be used in the 
prophylaxis or treatment of the diseases stated above (e.g. diseases caused by cellular senescence,
aging-associated diseases and diseases caused by cellular proliferation).

For the treatment or prevention of diseases and conditions associated with undesired 
proliferation, such as cancer, inflammation, or infection, the expression or activity of the present 
protein can be increased using any of a number of methods. For example, polynucleotides encoding
the protein can be introduced into the undesired cells, wherein the protein is then expressed and
inhibits the further growth of the cells. In one such embodiment, the polynucleotides can be
incorporated into liposomes comprising on their surface a specific molecule that directs the
targeting of the liposome to a specific cell type (e.g. a tumor-specific antibody). Alternatively, the
protein of SEQ ID NO:301 can itself be administered to the cells, e.g. as a fusion protein also 
comprising a specific targeting polypeptide moiety. Further, a compound that enhances the 
expression or activity of the protein can be administered to cells, preferably in a way that 
specifically targets the compound to undesired cells, e.g. chemically linked to a heterologous 
specific targeting molecule.



Protein of SEQ ID NO: 412 (internal designation 187-5-3-Q-C7-CS)

The protein of SEQ ID NO: 412, encoded by the cDNA of SEQ ID NO: 171, is homologous
to the human CDK4-binding protein p34SEI-1 (sptrembl accession number Q9UHV2). p34SEI-1 is a
new CDK4 regulator that prevents p16INK4a from inhibiting the formation of cyclin D1-CDK4
complexes. p34SEI-1 seems to act as a growth factor sensor and may facilitate the formation and
activation of cyclin D-CDK complexes in the face of inhibitory levels of INK4 proteins (Sugimoto
et al., Genes Dev. 13:3027-3033 (1999)).

Progression through the cell cycle is a complex process that is regulated at many levels by
several proteins. The activity of cyclin dependent kinases (CDK4 and CDK6) is regulated by the
association of a cyclin partner that acts as a positive effector and by two families of inhibitors,
the cdk inhibitor proteins (KIP) and the inhibitors of cdk4 (INK4) such as p16INK4a, which act as
negative effectors (Sandhu et al., Cancer Detect. Prev. 24:107-118 (2000)).

Cancer is a disease characterized by loss of cellular growth control, and the molecular
machinery of the cell cycle is involved in tumorigenesis. Many human tumors have been shown to
have abnormalities in this pathway resulting in either the functional inactivation of p16INK4a or the
excessive activity of CDK4 (Palmero et al., Cancer Surv. 27:351-357 (1996)).

It is believed that the protein of SEQ ID NO: 412 plays a role in cell cycle regulation via
binding to a cyclin dependent kinase. Other preferred polypeptides of the invention are
fragments of SEQ ID NO: 412 having any of the biological activities described herein. The binding
activity of the protein of the invention or part thereof to a cyclin dependent kinase, as well as its role
in the cell cycle, may be assayed using any of the assays known to those skilled in the art including
those described in Sugimoto et al., supra.

An embodiment of the present invention relates to methods of using the protein of the
invention or part thereof to identify and/or quantify cyclin dependent kinases, preferably CDK4, in
a biological sample, and thus to its use in assays and diagnostic kits for the quantification of such CDKs
in bodily fluids, in tissue samples, and in mammalian cell cultures. The binding activity of the
protein of the invention or part thereof may be assessed using the assay described in Sugimoto et al., 
supra or any other method familiar to those skilled in the art. Preferably, a defined quantity of the 
protein of the invention or part thereof is added to the sample under conditions allowing the
formation of a complex between the protein of the invention or part thereof and the cyclin
dependent kinase to be identified and/or quantified. Then, the presence of the complex and/or
the free protein of the invention or part thereof is assayed and eventually compared to a control
using any of the techniques known by those skilled in the art. 

In another embodiment, the invention relates to compositions and methods using the protein
of the invention or part thereof to stimulate cell proliferation both in vitro and in vivo. For example,
soluble forms of the protein of the invention or part thereof may be added to cell culture medium in 
an amount effective to stimulate cell proliferation. 

The invention further relates to methods and compositions using the protein of the invention 
or part thereof to diagnose, prevent and/or treat several disorders associated with cell proliferation
including, but not limited to, adenocarcinoma, sarcoma, lymphoma, leukemia, melanoma,
myeloma, teratocarcinoma, cancers of the adrenal gland, bladder, bone, brain, breast, 
gastrointestinal tract, heart, kidney, liver, lung, ovary, pancreas, paraganglia, parathyroid, prostate, 
salivary gland, skin, spleen, testis, thyroid, uterus, and neurodegenerative disorders such as 
Alzheimer's disease (McShea et al., Am. J. Pathol. 150(6):1933-1939 (1997)). For diagnostic
purposes, quantification of the protein of the invention could be investigated, using Northern
blotting, RT-PCR, immunoblotting and any of the protocols known in the art, in biological samples and
compared to the expression in control biological samples. Thus a diagnostic assay may be used to
determine altered expression of the protein of the invention, to correlate it with disease states and to
evaluate its prognostic significance in diseases. For prevention and/or treatment purposes,
inhibition of the endogenous expression of the protein of the invention using any of the antisense or
triple helix methods described herein may be used. Alternatively, inhibitors of the protein's activity
may be developed and used to inhibit and/or to reduce the protein's activity using any methods
known to those skilled in the art. Antibodies which specifically bind to the protein of the invention 
may be generated using methods that are well known in the art and used as an antagonist. 

Protein of SEQ ID NO: 299 (internal designation 184-1-4-0-C11-CS)

The protein of SEQ ID NO : 299 encoded by the cDNA of SEQ ID NO: 58 and found in 
fetal liver and liver, is orthologous to the BolA protein. The BolA family comprises the
morphoprotein BolA from E. coli and its various homologs. The expression of BolA is growth rate
regulated and is induced during the transition into the stationary phase. BolA is also induced by
stress during early stages of growth and can have a general role in stress response. It has also been
suggested that BolA can induce the transcription of penicillin-binding proteins 6 and 5 (EMBO J. 1;8
(2):3923-31 (1989)).

E. coli cells become thinner and shorter after a period of starvation or stationary-phase 
conditions; this altered morphology is an adaptive response of E. coli to general forms of stress.
The bolA gene seems to be involved in the switching between cell elongation and septation systems
during the cell division cycle [J Bacteriol 170:5169-5176 (1988)]. The regulation of bolA has been
linked to the presence of a gearbox promoter from which RNA is transcribed [Mol Microbiol 5:2085-2091
(1991)].

Expression of bolA is governed by two promoters. P2 is located further upstream from the
structural gene, is under the control of σD and transcribes bolA constitutively. The promoter P1,
proximal to the structural gene, is a gearbox promoter under the control of σS from which bolA has
been shown to be transcribed in an inverse growth rate-dependent fashion [J Bacteriol 173:4474-4481
(1991)].

The alternate sigma factor σS is encoded by the gene rpoS and has been described as a
central regulator for the induction of a set of specific genes involved in adaptation to stationary
phase. It has, nevertheless, been shown that σS function is not confined to stationary phase.
Significant increases in σS cellular levels were seen during exponential growth in response to
forms of stress; genes under its control code for important adaptive regulators for general stress
conditions [FEMS Microbiol Lett 30:419-430 (1997)].

The smaller morphology caused by stress-induced overexpression of bolA reduces the
surface area exposed to the environment and decreases the cell's surface-to-volume ratio.

Identification of orthologous genes provides important information regarding functional and
structural conservation within these orthologs throughout evolution. The concept of comparative
gene identification has been previously used by many laboratories to search for orthologous genes
once a particular gene of interest has been identified in another species [Genome Res 10(5):703-13
(2000)].

The protein of the invention contains a signal peptide corresponding to a short helix as
predicted by the software TopPred II [Claros and von Heijne, CABIOS applic. Notes, 10: 685-686
(1994)]. Thus, one aspect of this invention provides materials and methods for the delivery of
recombinant proteins to liver cells. The signal peptide, encoded by an appropriate polynucleotide,
can be linked to another protein/polypeptide (also encoded by an appropriate polynucleotide). The 
recombinant gene, containing the signal peptide sequence, is expressed and the desired protein is 
delivered via the signal peptide. Methods of producing such gene fusions, the expression of such 
gene products, and their use are well known to the skilled artisan. 

In another embodiment, BolA, or biologically active fragments thereof, can be used to
modulate the stress response to environmental changes such as cytotoxic agents, heat shock,
irradiation, genotoxic stress or growth factors. 

In another aspect of the invention, SEQ ID NO: 299 is incorporated into a prokaryotic
expression vector and transfected into prokaryotic cells unable to adapt to environmental stress or 
cells containing a BolA defect. The expression vector can, optionally, contain a promoter system 
such as that described supra (e.g., P1, P2, σD, σS, etc.) which typically controls bolA expression. The
components necessary for transcription can be provided in one or more expression vectors.
Prokaryotic cells thus transformed can be useful in bioremediation systems where environmental
stress is commonly encountered. Thus, preferred prokaryotes for the practice of this aspect of the
invention lack bolA, or contain a bolA defect, and are known to be useful for bioremediation.

In another embodiment, the subject invention provides methods and compositions to
selectively identify liver tissues. The protein encoded by SEQ ID NO: 299 can be used to
synthesize specific polyclonal or monoclonal antibodies using any techniques known to those 
skilled in the art. These antibodies can be used to selectively identify liver tissue according to
well-known histological immunoassays. The ability to immunologically identify tissue samples is
industrially important for analysis of mismarked biopsy samples (e.g., laboratory errors) where the
origin of the tissue sample is in question, or simply to verify that a tissue sample originated from
liver. The antibodies can also be used to identify cancer metastases originating from the liver. 

Further, antibodies provided by the subject invention can also be used to assay animal feeds 
for the presence of liver or liver by-products. As is known, many animal feeds contain animal 
protein. The use of animal feeds containing animal protein has been associated with disorders in
both animals and humans (the most notorious of which is bovine spongiform encephalopathy). This
has resulted in the banning of animal protein in feeds provided to animals. However, to ensure 
compliance with such bans, animal feeds must be tested. Thus, the antibodies of the invention can 
be used to test animal feeds for the presence of liver according to methods known to those skilled in 
the art. 

Proteins of SEQ ID NOs: 249 and 288 (internal designations 105-037-2-0-H11-CS and
174-7-4-0-H1-CS, respectively)

The 403-amino-acid-long protein of SEQ ID NO: 249 encoded by the cDNA of SEQ ID 
NO: 8 is extensively homologous to the protein of SEQ ID NO: 288 encoded by the cDNA of SEQ 
ID NO: 47 with the exception of five amino acids in positions 192-194 and 298-299, which are not
present in the protein of SEQ ID NO: 288. It is likely that the two proteins are the result of an
alternative splicing and display similar functions and utilities. 

The 403-amino-acid-long protein of SEQ ID NO: 249, overexpressed in salivary glands, 
exhibits extensive homology to a Mus musculus hypothetical protein (Genbank accession number
AB030196). The amino acid residues of the protein of SEQ ID NO: 249 show a high degree of identity to
the Genbank sequence. However, the protein of the Genbank sequence lacks twenty amino
acids (192 to 194, 298 to 303, 353 to 354, 380 to 381, and 387 to 393) and also differs from
SEQ ID NO: 249 at 35 amino acid positions. In addition, four transmembrane domains are predicted for
the protein of SEQ ID NO: 249, at positions 31 to 51, 75 to 95, 154 to 174, and 236 to 256,
by the software TopPred II (Claros and von Heijne, CABIOS applic. Notes, 10:685-686 (1994)).

When expressed in E. coli, the matched sequence suppresses bacterial growth (Inoue et al,
Biochem Biophys Res Commun 268:553-61 (2000)). It is therefore believed that the proteins of
SEQ ID NO: 249 and 288 or a bacterial growth suppressing fragment thereof can be used to 
suppress bacterial growth by contacting bacteria (gram negative or gram positive) with the 
polypeptides of the invention. The growth inhibiting activity of the protein of the invention or part
thereof may be assayed using any of the assays known to those skilled in the art including those
described in Inoue et al, supra. 

In accordance with one aspect of the invention, methods and compositions using the protein 
of the invention or a fragment thereof to suppress bacterial growth are provided. In a preferred 
embodiment, the protein of the invention is expressed in a bacterium, preferably E. coli, using
recombinant DNA technology methods known to those skilled in the art. The expressed protein can
then be used to inhibit bacterial growth. The effects of the expressed protein and analogs or 
antagonists thereof can be assessed using any methods or techniques known to those skilled in the 
art. 

Further included in the invention are the polypeptides encoded by the human cDNA of
clone 105-037-2-0-H11-CS-SD. The polypeptides of SEQ ID NO: 249 may be interchanged with
the corresponding polypeptides encoded by the human cDNA of clone 105-037-2-0-H11-CS-SD.
Further included in the invention are polynucleotides encoding said polypeptides. Preferred
polynucleotides are those of SEQ ID NO: 8 and of the human cDNA of clone 105-037-2-0-H11-CS-SD.

Nucleotide sequences encoding the polypeptides of SEQ ID NOs: 249 and 288 can be used
to generate probes for the detection of related genes. Vectors expressing the nucleotide sequence
can be used to express the polypeptide in target cells. Antisense nucleotides can be
used to inhibit the expression of the polypeptide. 

Thus, in another embodiment of the invention the protein or a fragment thereof can be used
as a marker protein to selectively identify tissues, preferably salivary glands. For example, the
protein of the invention or a fragment may be used to synthesize specific antibodies using any 
techniques known to those skilled in the art. Such tissue-specific antibodies may then be used to 
identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue that 
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In another embodiment, polynucleotides encoding the protein of
SEQ ID NO: 249 can be used for in situ hybridization.

The transcript coding for the protein gng3lg (Genbank accession number AF069954) is
transcribed from a bidirectional promoter divergently with the transcript coding for the gamma 3
subunit protein called gng3, a novel human G binding protein gamma-3 (HGPG) (Genbank
accession number AF069953) and this organization is conserved across species within the human
genome (Downes et al, Genomics, 53:220-230 (1998)).

Several genes which are linked in common physiological functions share a common
divergent bidirectional promoter, like αB crystallin and α crystallin, collagen type IV A1 and A2,
surfl-3, and surf 5 genes, dihydrofolate reductase and 2 mismatch repairl (Iwaki et al. Genomics,
45: 386-394 (1997), Burbelo et al, Proc. Natl. Acad. Sci. USA 85: 9679-9682 (1988), Kaytes et al,
J. Biol Chem, 263: 19274-19277 (1988), Poschl et al, EMBO J., 7:2687-2695 (1988), Soininen et
al, J. Biol Chem, 263: 17217-17220 (1988), Garson et al, Genomics, 30: 163-170 (1995),
Williams et al, Mol. Cell Biol, 6: 4558-4569 (1986), Fujii et al, J. Biol Chem., 264: 10057-10064
(1989)). The heterotrimeric G proteins, a family of GTPases, are present in all cells and control a
variety of functions (metabolic, humoral, neural and developmental) by transducing hormonal,
neurotransmitter and sensory signals into an array of cellular responses. Triggered by cell surface
receptors, each G protein regulates the activity of a specific effector, including adenylate cyclase,
phospholipase C, and ion channel proteins, which initiate appropriate biochemical responses. In
view of this, it is believed that the transcript coding for the protein of SEQ ID NO: 249 shares
common regulatory elements with the gng3 gene and that the products of these genes, the protein
of SEQ ID NO: 249 and gng3, are physiologically coupled in unknown ways. Thus, in an
embodiment of the invention, the protein of SEQ ID NO: 249 or part thereof may be used to 
regulate signal transduction of hormonal, neurotransmitter, and sensory signals to provide an array 
of cellular responses. 

In yet another embodiment of the invention, the polypeptides of the present invention and
the related polynucleotides may be used to treat several types of disorders including, but not limited
to, cancer, neurodegenerative diseases, cardiovascular disorders, hypertension, renal injury and
repair, and septic shock.

In one embodiment, the protein of SEQ ID NO: 249 or a fragment, derivative or analog
thereof may be administered to a subject to treat or prevent a disorder associated with decreased
expression or activity of the protein.

In a further embodiment, a vector capable of expressing the protein of SEQ ID NO: 249
may be administered to a subject to treat or prevent a disorder associated with decreased expression
or activity of the protein. Naked nucleotides encoding the protein of SEQ ID NO: 249 may also be
used. 

In yet another embodiment, a pharmaceutical composition comprising the substantially
purified protein of SEQ ID NO: 249 may be administered to a subject to treat or prevent a
disorder associated with decreased expression of the protein.

In still another embodiment, the polypeptide of SEQ ID NO: 249 can be used to develop
and screen antagonists. For example, purified polypeptide can be used to develop antibodies or to 
screen libraries of pharmaceutical agents to identify those that inhibit the physiological functions of 
the protein. 

Thus, in a further embodiment, an antagonist of the protein of SEQ ID NO: 249 can be
administered to a subject to prevent or treat a disease associated with increased expression or
activity of the protein. Similarly, in another embodiment a vector expressing the complement of a 
polynucleotide sequence encoding the protein of SEQ ID NO: 249 may be administered to decrease 
the expression of the protein. 

The protein of SEQ ID NO: 249 displays a leucine zipper pattern situated near its NH2
terminal part (position 20 to 41). Thus, it is believed that the protein of SEQ ID NO: 249 is able to
dimerize either with itself (homo-dimerisation) or with a heterologous protein
(hetero-dimerisation) of interest, through the mediation of its leucine zipper domain. Preferred polypeptides
of the invention are polypeptides comprising leucine zipper domain fragments and fragments
having any of the biological activities described herein. The multimerization activity of the protein
of the invention or part thereof may be assayed using any of the assays known to those skilled in the 
art including circular dichroism spectrum and thermal melting analyses as described in US patent 
5,942,433. The utilities of proteins containing leucine zipper domains, such as the protein of SEQ 
ID No: 249, are described elsewhere in the application. 
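
As a minimal sketch of how such a pattern can be located in a protein sequence, the code below scans for four leucines spaced seven residues apart (L-x(6)-L-x(6)-L-x(6)-L). This is an assumption about what "leucine zipper pattern" refers to here; the motif spans exactly 22 residues, which is consistent with positions 20 to 41. The toy sequence and function name are hypothetical and do not reproduce SEQ ID NO: 249.

```python
import re

# Assumed motif: four leucines, each separated by six arbitrary residues.
# The lookahead allows overlapping occurrences to be reported.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(protein: str):
    """Return 1-based (start, end) positions of every motif occurrence."""
    return [
        (m.start() + 1, m.start() + len(m.group(1)))
        for m in LEUCINE_ZIPPER.finditer(protein)
    ]


# Hypothetical toy sequence: 19 filler residues, then a motif, then filler.
toy_protein = "A" * 19 + ("L" + "A" * 6) * 3 + "L" + "A" * 10

if __name__ == "__main__":
    print(find_leucine_zippers(toy_protein))
```

A pattern match of this kind only flags a candidate region; whether the region actually mediates dimerization has to be established experimentally, for example by the assays mentioned above.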

Protein of SEQ ID NO: 259 (internal designation 114-016-1-0-H8-CS)

The protein of SEQ ID NO: 259, herein referred to as HOPP, is encoded by clone
114-016-1-0-H8-CS (SEQ ID NO: 18). This protein is homologous to proteins of Arabidopsis thaliana
(ASY1) and Saccharomyces cerevisiae (HOP1) (Caryl A.P. et al. Chromosoma, 109, 62-71;
Hollingsworth N.M. et al. Cell, 61, 73-84). 

In addition, the 394-amino-acid protein of SEQ ID NO: 259 displays a pfam HORMA
domain from position 22 to 230. The HORMA domain is a common structural element in mitotic
checkpoints, chromosome synapsis and DNA repair. For example, the HORMA domain was found
in: (1) MAD2, a key component of the mitotic-spindle-assembly checkpoint (reviewed in Straight
AF. Current Biology 1997, 7:613-616); (2) HOP1, a conserved protein that is involved in
meiotic-synaptonemal-complex assembly (Hollingsworth N.M. et al. Cell, 61, 73-84); and (3) Rev7p, a
subunit of the yeast DNA polymerase zeta that is involved in translesion, template-independent
DNA synthesis (Aravind L. and Koonin E.V., Trends Biochem Sci. 1998 Aug;23(8):284-6).

The pairing of homologous chromosomes during meiotic prophase culminates in the 
formation of the synaptonemal complex (SC), which is a ribbon-like, proteinaceous structure that
holds homologous chromosomes in close apposition along their entire length. The synaptonemal
complex (SC) is a prominent and evolutionally well conserved structure which is strictly meiotic.
Evidence from mutant phenotypes supports the hypothesis that recombination and SC formation are
mutually interdependent processes. First, although not required for homology recognition, the SC
could promote interhomolog interactions in situations where the normal processes have failed (e.g.,
interlocking, heterologous pairing, etc.). Second, polymerization of the SC components might
permit the recombination process to progress by modulating the number and localization of
reciprocal versus nonreciprocal exchanges (i.e. interference). Third, the SC may play an important role in meiotic
chromosome structure and especially inter-sister interactions. 

Synapsis of homologous chromosomes is a key event in meiosis as it is essential for normal 
chromosome segregation and is implicated in the regulation of crossover frequency (for review see 
Zickler D., J Soc Biol 1999;193(1):17-22). Mutants in HOP1 and ASY1, both proteins having
significant homology to the protein of SEQ ID NO: 259, display decreased levels of meiotic
crossover and intragenic recombination between markers on homologous chromosomes
(Hollingsworth N.M., Byers B., Genetics 1989 Mar;121(3):445-62; Caryl AP et al. Chromosoma
2000;109(1-2):62-71).

Thus, the invention relates to methods and compositions using the protein encoded by clone
114-016-1-0-H8-CS or the polynucleotide of SEQ ID NO: 18, or biologically active fragments thereof,
to restore normal chromosome segregation in cells by administration, in therapeutically effective
amounts, of compositions comprising a HOPP polypeptide encoded by clone 114-016-1-0-H8-CS, or a
polynucleotide encoding such a HOPP polypeptide. The loss of normal chromosome
segregation in normal cells leads to aberrant chromosome segregation events, a hallmark of tumor
progression. HOPP proteins, encoded by clone 114-016-1-0-H8-CS, can be targeted to the nucleus
by nuclear targeting sequences according to well-known methods. Nuclear targeting sequences (or 
NLS) can be chemically or recombinantly attached to HOPP. Alternatively, the HOPP gene can be 
used in known gene therapy protocols to restore normal chromosome function. 

Infertility due to gametogenic failure is frequently associated with structural autosomal
abnormalities. Recent meiotic studies, at pachytene stage, have shown a failure around the
breakpoints, an association of the translocation figure with the sex chromosomes, and the frequent 
involvement of the acrocentric chromosomes. Two main models are proposed to explain the male 
sterilizing effect of rearrangements. The impairment of spermatogenesis could be the result of: 1)
the XY-autosome interaction; or 2) the disruption around the breakpoints at the pachytene stage.
These defects may contribute significantly to germ-cell atresia (for review see Luciani JM,
Guichaoua MR Reprod Nutr Dev 1990; Suppl 1:95s-103s and Miklos GL. Cytogenet Cell Genet.
1974;13(6):558-77). Thus the subject invention also relates to methods and compositions using
the HOPP protein of SEQ ID NO: 259 or clone 114-016-1-0-H8-CS, or biologically active fragments
thereof, to reduce the incidence of infertility due to gametogenic failure. HOPP can be introduced
into sperm as described in the preceding paragraphs. HOPP, optionally joined to an NLS sequence,
can also be introduced into sperm or eggs by other methods well known in the art (such as
electroporation or microinjection). 

The protein of SEQ ID NO: 259, encoded by clone 114-016-1-0-H8-CS, can also arrest cell
division in human cells if the mitotic spindle apparatus is improperly attached to the chromosomes
(Allshire R.C. Current Opinion in Genetics and Development 1997, 7:264-273). In the absence of
functional protein of SEQ ID NO: 259, cells exposed to drugs which inhibit the formation of a
mitotic spindle, such as benomyl, vinblastine, nocodazole, etc. would be expected to undergo rapid
cell death due to massive chromosome loss. Human cells containing HOPP would be expected to 
survive such drug treatment because they are able to stop dividing prior to the chromosome loss
event. Tumor cells that are hypersensitive to chemotherapeutic agents that inhibit the formation
of the mitotic spindle, may be sensitive to these drugs because they are defective in the checkpoint 
protein. Thus, screening assays for the presence or absence of the protein in a given tumor would 
provide an indication of the chemosensitivity of a particular tumor. The present invention therefore 
includes methods of determining prognostic benefit of treating a patent with a chemotherapeutic 

1 5 agent or determining which chemotherapeutic agent from a group of at least two would a patient 
more likely benefit from. Furthermore, the loss of checkpoint function in a normal cell may 
predispose that cell to aberrant chromosome segregation events, a hallmark of tumor progression. 
Thus the antibodies, polypeptides and polynucleotides of the present invention would be useful in 
diagnosing particular cancers. 

Polyclonal antibodies can be produced by injecting a host animal such as a rabbit, rat, goat,
mouse or other animal with an immunogen of this invention. Serum is extracted from the host
animal and is screened to obtain polyclonal antibodies which are specific to the immunogen.
Methods of screening for polyclonal antibodies are well known to those of ordinary skill in the art,
such as those disclosed in Harlow & Lane, Antibodies: A Laboratory Manual, (Cold Spring Harbor
Laboratory, Cold Spring Harbor, N.Y.: 1988), the contents of which are hereby incorporated by
reference.

The monoclonal antibodies can be produced by immunizing, for example, mice with an
immunogen according to the invention. Methods of producing monoclonal antibodies are
well known in the art and include those of Kohler, G. and Milstein, C., Nature (1975) 256: 495-497.
Hybridomas can be expanded, if desired, and supernatants can be assayed by conventional
immunoassay procedures, for example radioimmunoassay. Positive clones can be further
characterized. Hybridomas that produce the desired antibodies can be grown in vitro or in vivo
using known procedures. The monoclonal antibodies can be isolated by conventional
immunoglobulin purification procedures such as ammonium sulfate precipitation, gel
electrophoresis, dialysis, affinity chromatography, and ultrafiltration.

Antibodies of the invention can be labeled with a detectable moiety. As noted above,
"detectable moieties" are well known to those of ordinary skill in the art and include, but are not
limited to, a fluorescent label, a radioactive atom, a paramagnetic ion, biotin, a chemiluminescent
label or a label which can be detected through a secondary enzymatic or binding step.

The invention further provides a method of determining the susceptibility of a tumor sample
to treatment with a mitotic spindle inhibitor which comprises the steps of: a) contacting the tumor
sample with an antibody, wherein the antibody is labeled with a detectable moiety and is capable of
specifically binding to the protein of the invention, or a fragment thereof, and b) assaying for the
presence of an immunocomplex formed in step (a). The absence of the immunocomplex indicates
that the tumor would be susceptible to treatment with a mitotic spindle inhibitor such as benomyl,
vinblastine, and nocodazole.

The subject invention also provides a pharmaceutical composition comprising nucleic acid
encoding the protein of SEQ ID NO: 259 and a carrier. In one aspect of this invention, the
compositions are capable of passing through a cell membrane and provide for the expression of the
protein of the invention. As used herein, the term "carrier" includes pharmaceutically acceptable
carriers and encompasses any of the standard pharmaceutically accepted carriers, such as phosphate
buffered saline solution, water, emulsions such as an oil/water emulsion or a triglyceride emulsion,
various types of wetting agents, tablets, coated tablets and capsules. Typically such carriers contain
excipients such as starch, milk, sugar, certain types of clay, gelatin, stearic acid, talc, vegetable fats
or oils, gums, glycols, or other known excipients. Flavor and color additives or other ingredients can
also be included. In addition to the standard characteristics of the pharmaceutically acceptable
carriers, the "suitable" carriers of the subject invention can also include those carriers which are able
to penetrate the cell membrane. Therefore, in one embodiment of the pharmaceutical composition, the
pharmaceutically acceptable carrier binds to a receptor on a cell and is capable of being taken up by
the cell after binding to the receptor.

This invention further provides a method of suppressing tumor formation in a subject which
comprises administering a nucleic acid encoding the protein of the invention in an amount effective to
enhance expression of this protein.

Proteins of SEQ ID NOs: 311 and 312 (internal designations 188-28-4-0-B12-CS.corr and
188-28-4-0-B12-CS.fr, respectively)

The 466-amino-acid-long protein of SEQ ID NO: 311, encoded by the human cDNA of
clone 188-28-4-0-B12-CS or the cDNA nucleotide sequence of SEQ ID NO: 70, is related to
proliferating-cell nucleolar antigen p120 (Genbank accession number M32110) encoded by nol1,
and the yeast nucleolar protein Nop2p encoded by nop2 (Genbank accession number U12141). SEQ
ID NO: 311 (encoded by clone 188-28-4-0-B12-CS.corr) shows strong homology with three
proteins described as homologs of p120 (Genbank accession number AK002229 and Geneseqp
accession numbers: Y86441, Y86442). In addition, the protein of SEQ ID NO: 311 is a
polymorphic variant of SEQ ID NO: 312 encoded by the cDNA of SEQ ID NO: 71.

In addition, the protein of the invention exhibits the pfam NOL1/NOP2/sun family signature
from positions 201 to 276. This motif is also found by emotif from positions 230 to 245. The
NOL1/NOP2/sun family includes p120 and Nop2p. These proteins are involved in nucleolar
structure and activity as well as the regulation of the cell cycle.
Freeman J.W. et al. (Cancer Res. 48: 1244-51, 1988) identified p120, a 120-kD nucleolar
antigen associated with proliferating cells. This protein is a proliferation-associated antigen that is
temporally regulated during the cell cycle and demonstrates a dramatic increase in expression at the
G1-S boundary. This suggests that p120 can play a role in the regulation of the cell cycle and the
increased nucleolar activity that is associated with cell proliferation (Fonagy A. et al. (1993) J. Cell.
Physiol. 154:16-27).

The human p120 protein is also the most cancer specific of the identified proliferation-associated
nucleolar proteins. Antigen p120 was detectable in a broad range of human malignant
tumors but not in benign tumors or corresponding normal tissues. The antigen was not detectable in
growth-arrested cells but was expressed early in G1 of the cell cycle.

Overexpression of human p120 leads to the transformation of NIH 3T3 cells. Expression of
antisense p120 constructs causes the p120-transformed cells to revert to their original phenotype.
Perlaky L. et al. [Cancer Res. 52: 428-36 (1992)] and Valdez et al. [Cancer Res. 52: 5681-87
(1992)] reported that the middle region of antisense p120 RNA inhibited proliferation of NIH 3T3
cells to approximately the same extent as the full-length antisense construct. The predicted mouse
and human p120 proteins are 63% identical.

Another protein of the Nol1/Nop2/Sun family, Nop2p, encoded by the gene NOP2, has a role
in nucleolar function during the onset of growth and in the maintenance of nucleolar structure (de
Beus et al. (1994) J. Cell Biol. 127:1799-813). The two proteins, p120 and Nop2p, are associated with
ribosomal RNA in pre-ribosomal particles and can mediate the maturation process of the ribosome
(Hong B. et al. (1997) Mol. Cell Biol. 17:378-88; Gustafson W.C. et al. (1998) Biochem. J.
331:387-93).

The subject invention provides the polypeptides encoded by the human cDNA of clone
188-28-4-0-B12-CS and polynucleotide sequences encoding the same amino acid sequences. Also
included in the invention are biologically active fragments of the protein encoded by the human
cDNA of clone 188-28-4-0-B12-CS and polynucleotide sequences encoding these biologically
active fragments. "Biologically active fragments" are defined as those peptide or polypeptide
fragments having at least one of the biological functions of the full length protein (e.g., the ability to
transform cell lines in vitro).

The invention also provides variants of the protein of SEQ ID NO: 311, encoded by clone
188-28-4-0-B12-CS. These variants have at least about 80%, more preferably at least about 90%,
and most preferably at least about 95% amino acid sequence identity to the amino acid sequence
encoded by clone 188-28-4-0-B12-CS. Variants according to the subject invention also have at
least one functional or structural characteristic of the protein encoded by clone 188-28-4-0-B12-CS.
The invention also provides biologically active fragments of the variant proteins. Unless otherwise
indicated, the methods disclosed herein can be practiced utilizing the protein encoded by clone
188-28-4-0-B12-CS, or clone 188-28-4-0-B12-CS, or variants thereof. Likewise, the methods of the
subject invention can be practiced using biologically active fragments of the protein encoded by
clone 188-28-4-0-B12-CS, clone 188-28-4-0-B12-CS, or variants of said biologically active
fragments.
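
The identity thresholds recited for the variants (at least about 80%, 90% and 95%) can be illustrated with a short calculation. The following is only a minimal sketch, assuming that the two amino acid sequences have already been aligned by some external tool and that gaps are written as "-". The function names, the tier labels and the example peptides are hypothetical and are not taken from the specification. Percent identity is computed here as matches divided by aligned columns (columns that are gaps in both sequences are skipped); other definitions exist and can give slightly different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: gaps are written as '-', and a gap opposite a residue
    counts as a compared column that does not match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column is empty in both sequences, skip it
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the three thresholds named in the text."""
    if pct >= 95.0:
        return "most preferred"
    if pct >= 90.0:
        return "more preferred"
    if pct >= 80.0:
        return "preferred"
    return "below threshold"


if __name__ == "__main__":
    # Hypothetical 10-residue peptides, for illustration only.
    reference = "MKTAYIAKQR"
    variant = "MKTAYLAKQR"  # one substitution
    pct = percent_identity(reference, variant)
    print(pct, identity_tier(pct))
```

In practice the alignment step itself (and the choice of scoring matrix and gap penalties) determines the reported identity, so a sketch of this kind only covers the final counting step.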

Because of the redundancy of the genetic code, a variety of different DNA sequences can
encode the amino acid sequence provided by clone 188-28-4-0-B12-CS. It is well within the skill
of a person trained in the art to create these alternative DNA sequences encoding proteins having
the same, or essentially the same, amino acid sequence. These variant DNA sequences are, thus,
within the scope of the subject invention. As used herein, reference to "essentially the same"
sequence refers to sequences that have amino acid substitutions, deletions, additions, or insertions
that do not materially affect biological activity. Fragments retaining one or more characteristic
biological activities of the protein encoded by clone 188-28-4-0-B12-CS are also included in this
definition.

"Recombinant nucleotide variants" are alternate polynucleotides which encode a particular 
protein. They can be synthesized, for example, by making use of the "redundancy" in the genetic
code. Various codon substitutions, such as the silent changes which produce specific restriction
sites or codon usage-specific mutations, can be introduced to optimize cloning into a plasmid or
viral vector or expression in a particular prokaryotic or eukaryotic host system, respectively.
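
The redundancy of the genetic code referred to above can be made concrete with a small example. The sketch below assumes the standard genetic code (the usual 64-codon table written in T, C, A, G order); the function names and the short test sequences are hypothetical and serve only as an illustration. It shows that two different DNA sequences can encode the same peptide, and counts how many distinct coding sequences exist for a given peptide.

```python
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W"   # first base T
          "LLLLPPPPHHQQRRRR"   # first base C
          "IIIMTTTTNNKKSSRR"   # first base A
          "VVVVAAAADDEEGGGG")  # first base G
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (length divisible by 3) into protein."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_codons(amino_acid: str) -> list[str]:
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(synonymous_codons(aa)) for aa in peptide.upper())


if __name__ == "__main__":
    # Two different DNA sequences, one peptide (a silent change in codon 2).
    print(translate("ATGCTG"), translate("ATGTTA"))
    print(synonymous_codons("L"))
    print(count_encodings("MLS"))
```

A silent substitution of this kind is what allows a restriction site to be introduced, or codon usage to be adapted to a host, without changing the encoded amino acid sequence.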

In one aspect of the subject invention, SEQ ID NO: 311, encoded by clone 188-28-4-0-B12-CS,
and variants thereof, can be used to generate polyclonal or monoclonal antibodies. Both
biologically active and immunogenic fragments of the amino acid sequence or variant proteins can
be used to produce antibodies. Polyclonal and/or monoclonal antibodies can be made according to
methods well known to the skilled artisan. Antibodies produced in accordance with the subject
invention can be used in a variety of detection assays known to those skilled in the art. Another
aspect of this invention provides monoclonal and polyclonal antibodies which do not cross-react
with known p120 proteins.

In one embodiment, the protein encoded by clone 188-28-4-0-B12-CS, variants of said
protein, and biologically active fragments of the protein or said variants can be used as a
nucleolar-fraction marker in nuclear fractionation studies or as a marker of pre-ribosomal particles, in
methods well known to the skilled artisan.

In another embodiment, the protein encoded by clone 188-28-4-0-B12-CS, variants of said
protein, and biologically active fragments thereof, can be used as a proliferation marker in
neoplastic cells. Alternatively, quantitative immunoassays can be used to assess the levels of the
protein in resected cancerous and normal tissues. Alternatively, levels of the protein encoded by
clone 188-28-4-0-B12-CS can be compared between an individual and a "normal" control group as
a prognostic indicator of malignancy. Further, the relationship between protein expression and cell
proliferation can be assayed using cancer cell lines. Thus, the protein of the invention or part thereof
can be used as a marker for proliferation in human cancer cells in vivo and in vitro. If the absence of
expression of the protein of the invention in normal tissues and benign tumors is confirmed, it could
serve as a marker of malignant cancer cells. The proliferation rate of cancer cells can also be determined
by quantitative analysis of the expression of the protein encoded by 188-28-4-0-B12-CS, or
biologically active fragments thereof, according to methods described in Trere D. et al. (J. Pathol.
192:216-20, 2000).

The transforming activity of the protein of the invention can be assayed as described in
Perlaky et al. (Anticancer Drug Des. 8:3-14, 1993). Thus, polynucleotides encoding the (188-28-4-0-B12-CS)
protein can be used to induce transformation of NIH/3T3 cells in vitro. Alternatively,
the polynucleotide encoding (188-28-4-0-B12-CS) can be used to provide antisense
oligonucleotides useful in antisense therapeutic protocols according to methods known in the art.

Protein of SEQ ID NO: 406 (internal designation 174-32-4-0-F8-CS)

The 378-amino-acid-long protein of SEQ ID NO: 406 encoded by the cDNA of SEQ ID
NO: 165 is expressed in tissues such as colon, prostate and salivary glands and overexpressed in
colon and prostate. The C-terminus of the protein of the invention is homologous to the human
retinoblastoma-binding protein, RbAp48 (Qian YW et al. (1993) Nature 364:648-652, GenBank
accession number: X74262) and to its homologues conserved in other organisms including mouse
(GenBank accession number: Q60972) and C. elegans (GenBank accession number: AF116530).
The protein of the invention also contains two internal WD-repeat clusters (Prosite PS00678, amino
acid positions 267 to 304 and positions 333 to 370, respectively), a structural motif involved in
protein-protein interactions in signal transduction pathways and transcription regulation (Neer EJ et al.
(1994) Nature 371:297-300; Neer EJ et al (1996) Cell 84:175-178).

The retinoblastoma protein (Rb) is the product of the retinoblastoma gene. Deletion or 
inactivation of both Rb alleles is essential in the formation of human retinoblastoma in both 
hereditary and sporadic forms (Benedict WF et al. (1983) Science 219: 973-975). 

Loss-of-function mutations in the Rb gene are also found in many other tumor types,
including osteosarcoma, breast carcinoma, small cell lung carcinoma, bladder carcinoma, prostate
carcinoma and soft tissue sarcoma (Bookstein R et al. (1991) Crit. Rev. Oncog. 2:211-227).
Introduction of the wild-type Rb gene into cultured retinoblastoma cells suppresses cell growth and
their tumorigenicity in nude mice (Huang HJ et al (1988) Science 242:1563-1566). Expression of normal
Rb protein in prostate carcinoma, osteosarcoma, breast carcinoma and bladder carcinoma cells also
suppresses their neoplastic phenotype, thus establishing the Rb gene as a tumor suppressor
(Reviewed by Weinberg RA (1991) Science 254:1138-1146).

It has been shown that the Rb gene product is a nuclear phosphoprotein that undergoes
cyclic phosphorylation and dephosphorylation during cell cycling. Rb is unphosphorylated or
"underphosphorylated" during early G1 phase, becomes phosphorylated just before S phase, and
remains phosphorylated until late mitosis. Injection of unphosphorylated Rb protein into cells during
early G1 phase inhibits the entry into S phase, suggesting that some of the growth suppressor
functions of Rb may be carried out by the underphosphorylated form of Rb (Goodrich et al. (1991)
Cell 67:293-302; Hinds PW et al. (1992) Cell 70:993-1006). Rb protein not only regulates the cell
cycle, but is also involved in cell differentiation. For example, lens epithelial cells in Rb-deficient
mice fail to terminally differentiate and undergo apoptosis (Morgenbesser et al. (1994) Nature
371:72-74).

It has been demonstrated that Rb protein inhibits cellular growth and proliferation through
interactions with multiple cellular proteins that interfere with these cellular proteins' downstream
actions. For example, Rb protein is able to form specific complexes with the transcription factor E2F,
which regulates the expression of a set of genes essential for the G1 to S phase transition (Nevins JR et
al (1992) Science 258:424-429). The Rb protein restrains cell cycle progression by masking the E2F
transactivation domain and by blocking the interaction of surrounding enhancer elements and the basal
transcription complex (Weintraub SJ et al (1995) Nature 375:812-815). Association of Rb and UBF, a
ribosomal transcription factor, results in suppression of the synthesis of ribosomal RNA by RNA
polymerase I (Cavanaugh AH et al (1995) Nature 374:177-180; Mancini M et al (1994)
Proc.Natl.Acad.Sci.USA 91:418-422).

RbAp48 was first identified as a major protein from HeLa cells that binds to a putative
functional domain at the C-terminus of the Rb protein. Only unphosphorylated and
hypophosphorylated forms of the Rb protein were coprecipitated with RbAp48. Like Rb protein,
RbAp48 is a ubiquitously expressed nuclear protein that shares sequence homology with MSI1, a
negative regulator of the Ras-cAMP pathway in the yeast Saccharomyces cerevisiae. Overexpression of
RbAp48 can convert the yeast mutant strains from heat-shock sensitive to heat-shock resistant, similar
to the result obtained from MSI1 overexpression. Thus, the human RbAp48 is a functional homologue
of MSI1 (Qian YW et al. (1993) Nature 364:648-652).

RbAp48 protein was later found to be the p48 subunit of mammalian chromatin assembly factor
1 (CAF-1) and to be present in the histone deacetylase complex (Parthun MR et al (1996) Cell 87:85-94).
CAF-1 from human cell nuclei consists of three subunits, p150, p60 and p48, and is involved in
assembling histone H3 and histone H4 onto nascent nucleosome structures during DNA replication in S
phase (Kaufman PD et al (1995) Cell 81:1105-1114). Indeed, some transcriptional repressors function
through the recruitment of the histone deacetylase complex (HDAC); the latter acts by acetylating or
deacetylating the tails protruding from the core histones, thereby modulating the local structure of
chromatin (Reviewed by Pazin MJ et al (1997) Cell 89:325-328). Rb protein recruits HDAC for
binding to E2F to repress transcription (Brehm A et al (1998) Nature 391:597-601; Magnaghi-Jaulin L
et al (1998) Nature 391:601-604). It was also reported that the p48 subunit of chicken CAF-1 can bind
to chicken HDAC in vitro through interaction of WD-40 repeats present in both protein sequences
(Ahmad A et al (1999) J.Biol.Chem. 274:16646-16653).

The WD-40 protein family is characterized by the repetition of a loosely conserved repeat of
approximately 40 amino acids, the repeats being separated from one another by a Trp-Asp dipeptide
sequence. The conserved core of this repeat, which usually ends with the amino acids Trp-Asp (WD),
was first identified in the beta-subunit of the heterotrimeric GTP-binding protein, G-protein (Fong H et
al. (1986) Proc.Natl.Acad.Sci.USA 83:2162-2166). Among the WD-40 proteins identified to date, none
are enzymes, and all seem to have regulatory functions (Neer, E. J. et al. (1994) Nature 371:297-300).
A number of WD repeat proteins have been localized to the nucleus and function in the repression of
transcription. These include Tup1, Hir1, and Met30 in S. cerevisiae; SCON2 in Neurospora crassa;
extra sex combs and Groucho in Drosophila; COP1 in Arabidopsis thaliana; and HIRA and the family
of TLE proteins in humans. These WD-40 proteins turn off a wide variety of genes, including those
involved in segmentation, sex determination, and neurogenesis (controlled by Groucho) and those
involved in photomorphogenesis (controlled by COP1). All of these WD-40 containing proteins have
been proposed to fold into propellers in which the internal beta-strands form a rigid skeleton that is
fleshed out on the surface by specialized loops to which other proteins bind (Lambright DG et al (1996)
Nature 379:311-319; Sondek J et al (1996) Nature 379:369-374).

Thus, discovery of new Rb-binding proteins is necessary to design methods of regulating cell
growth and blocking tumorigenesis through the control of tumor suppressor proteins in their interaction
with oncogene products, and may provide new compositions which are useful in the diagnosis,
prevention and treatment of cancer and developmental disorders.

It is believed that the protein of SEQ ID NO: 406 or part thereof plays a role in the control of
gene expression, probably as a transcription repressor. The protein of the invention is thought to be able
to bind to other proteins, preferably to nuclear proteins, more preferably to Rb. Preferred polypeptides
of the invention are polypeptides comprising fragments of SEQ ID NO: 406 from positions 159-373,
267-304 and 333-370. Other preferred polypeptides of the invention are polypeptides comprising
fragments of SEQ ID NO: 406 having any of the biological activities described herein. The ability of the
protein of the invention or part thereof to function as a transcription repressor may be assessed using
techniques well known to those skilled in the art including those described previously (Weintraub SJ
(1995) Nature 375:812-815; Qian YW (1995) J.Biol.Chem. 270:25507-25513). The ability of the
protein of the invention or part thereof, especially fragments containing WD-repeats, to bind to other
proteins may be assessed using techniques well known to those skilled in the art including those
described herein. For example, the protein of the invention could be used as a "bait" protein in a yeast
two-hybrid system (e.g. the Gal-4-based system from Clontech) to isolate and eventually to
identify its interacting protein partner in vivo from a cDNA library. Alternatively, the protein of the
invention or part thereof can be used either in a pure form or in a fusion form (linked to a reporter gene
product, such as alkaline phosphatase) to screen a phage cDNA expression library derived from
selected tissues or cell types of a given organism (Scott et al (1990) Science 249:386-390; Lam et al
(1992) Nature 354:82-84). Preferably, the binding ability of the protein of the invention is tested in
mammalian cell transfection experiments. When the protein is fused in-frame to a suitable peptide tag in
an expression vector, such as [His]6 in the pRset expression plasmid vector (Invitrogen), and introduced
into cultured cells, the proteins that bind to the expressed fusion protein can be immunoprecipitated
using an anti-[His]6 antibody. This approach can also be employed to confirm the findings obtained
from either the yeast two-hybrid system or in vitro phage peptide library screening. In this case, the
putative interacting partner protein will be fused to a distinct tag in a second expression vector and
co-transfected into cultured cells. A true binding complex will be co-immunoprecipitated with the two
different anti-tag antibodies. In a particular embodiment, an affinity chromatography method is carried
out to identify the interacting protein partners in vitro from cell lysates as performed for the
identification of the RbAp48 protein (Qian YW et al. (1993) Nature 364:648-652).

An embodiment of the present invention relates to methods of using the protein of the
invention or part thereof, particularly polypeptides containing WD-motifs, or derivatives thereof to
identify and/or quantify binding proteins, preferably nuclear proteins, more preferably Rb, in a
biological sample, and thus to their use in assays and diagnostic kits for the quantification of such
binding proteins in bodily fluids, in tissue samples, and in mammalian cell cultures. Such assays may be
particularly useful as diagnostic or prognostic tools in the detection and monitoring of a disorder
linked to dysregulation of expression of a transcription regulator. Such assays may thus be very
useful to assess the level of the tumor suppressor Rb in disorders including but not limited to
developmental disorders and cancers such as retinoblastoma, prostate carcinoma, osteosarcoma, breast
carcinoma and bladder carcinoma. The binding activity of the protein of the invention or part
thereof may be assessed using any method familiar to those skilled in the art. Preferably, a defined
quantity of the protein of the invention or part thereof is added to the sample under conditions
allowing the formation of a complex between the protein of the invention or part thereof and the
binding protein to be identified and/or quantified. Then, the presence of the complex and/or the
free protein of the invention or part thereof is assayed and eventually compared to a control using
any of the techniques known by those skilled in the art.

Another embodiment of the present invention relates to compositions and methods using the
protein of the invention or part thereof or derivative thereof to block gene transcription either in vitro or
in vivo. In a preferred embodiment, the protein of the invention or part thereof or derivative thereof is
added in an effective amount to an in vitro culture to inhibit gene expression and thus cell proliferation,
using molecular biology techniques known to those skilled in the art that allow the import of the protein
from the extracellular medium to the cell's nucleus. In another embodiment, eukaryotic cells are
genetically engineered in order to express the protein of the invention or part thereof under specific
conditions in order to prevent further proliferation of such cells upon demand, for example upon infection,
transformation, activation, differentiation, or the end of a production process.

A preferred embodiment of the invention relates to compositions or methods using SEQ ID
NO: 406, SEQ ID NO: 165 or part thereof to diagnose, treat and/or prevent disorders including but
not limited to disorders linked to dysregulation of gene transcription such as cancers and other
disorders relating to abnormal cellular differentiation, proliferation, or degeneration, including
hyperaldosteronism, hypocortisolism (Addison's disease), hyperthyroidism (Graves' disease),
hypothyroidism, colorectal polyps, gastritis, gastric and duodenal ulcers, ulcerative colitis, and
Crohn's disease; metabolic diseases such as obesity; and a number of inflammatory diseases due to
interleukin over-expression. For diagnostic and prognostic purposes, the expression of the protein of
the invention could be investigated using any of the Northern blotting, RT-PCR or immunoblotting
methods described herein and compared to the expression in control individuals. For prevention
and/or treatment purposes, the protein of the invention may be overexpressed using any of the gene
therapy methods known to those skilled in the art including those described herein. For example,
expression of the protein of the invention can be upregulated by infecting tumor cells with a
retroviral or an adenoviral vector which expresses the desired protein at the higher levels necessary for
suppression of mutations in the Rb gene or in other oncogenic or tumor suppressor genes.

Another related embodiment relates to the use of SEQ ID NO: 406, SEQ ID NO: 165, its
complement, or any part thereof to develop antagonists of the protein of the invention. These
antagonists could be antisense oligonucleotides, triple helices, ribozymes, small molecules or
antibodies, especially neutralizing antibodies binding to the WD-repeats of the invention, and may
be used to treat diseases and conditions caused by abnormally low transcription. These conditions
include accelerated aging syndromes such as Cockayne's syndrome, Ataxia telangiectasia and
Werner's syndrome, as well as age-associated diseases and "early onset" forms of diseases
associated with old age such as dementia and Parkinson's disease.

In another embodiment, the invention relates to methods and compositions using the protein
of the invention or part thereof as a marker protein to selectively identify tissues, preferably colon
and prostate. For example, the protein of the invention or part thereof may be used to synthesize specific
antibodies using any techniques known to those skilled in the art including those described herein.
Such tissue-specific antibodies may then be used to identify tissues of unknown origin, for example,
forensic samples, differentiated tumor tissue that has metastasized to foreign bodily sites, or to
differentiate different tissue types in a tissue cross-section using immunochemistry.

Protein of SEQ ID NO: 414 (internal designation 188-27-3-0-G1-CS)

The 389-amino-acid-long protein of SEQ ID NO: 414, expressed in brain, fetal brain,
placenta and testis, over-expressed in brain and encoded by the cDNA of SEQ ID NO: 173, is
homologous to SIRTUIN-2 (SIRT2) (Trembl accession number: Q9Y6E9) and Silent Information
Regulator 2-like protein (SIR2L) (Trembl accession number: Q9UNT0), which belong to the Silent
Information Regulator type 2 (SIR2) family. In addition, the protein of the invention presents the
Pfam signature for members of the SIR2 family (amino acids 84-268). Furthermore, the protein of
the invention displays two characteristic motifs highly conserved among all members of the SIR2
family that have been shown to be essential in the SIR2 silencing function (Moira M. et al.,
Genetics, 154:1069-1083 (2000)). These motifs are from positions 84 to 98 and from positions 165
to 170 of the protein of SEQ ID NO: 414 and correspond to the GAG(I/V)SxxxG(I/V)PDFRS and
(Y/I)TQNID patterns respectively. The protein of SEQ ID NO: 414 also has conserved cysteine
residues at positions 195, 200, 221 and 224, covering a domain thought to be either a DNA-binding
zinc-finger motif (Prodom prediction PD002659, from positions 195 to 224) or an enzymatic
domain (or an enzyme cofactor) (Moira M. et al., Genetics, 154:1069-1083 (2000)).
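
The two conserved SIR2 motifs described above lend themselves to a simple pattern search. The following is a minimal sketch only, assuming that the first pattern reads GAG(I/V)SxxxG(I/V)PDFRS (15 residues, consistent with positions 84 to 98), that "x" denotes any amino acid, and that the second pattern is (Y/I)TQNID. The function name and the short test peptide are hypothetical; the test peptide is not the sequence of SEQ ID NO: 414.

```python
import re

# Regular-expression forms of the two motif patterns (assumed reading,
# with "x" meaning any residue and "(I/V)" meaning I or V).
SIR2_MOTIFS = {
    "motif_1": re.compile(r"GAG[IV]S...G[IV]PDFRS"),
    "motif_2": re.compile(r"[YI]TQNID"),
}


def find_sir2_motifs(protein: str) -> dict[str, list[tuple[int, int]]]:
    """Return 1-based, inclusive (start, end) positions of each motif."""
    protein = protein.upper()
    hits: dict[str, list[tuple[int, int]]] = {}
    for name, pattern in SIR2_MOTIFS.items():
        hits[name] = [(m.start() + 1, m.end()) for m in pattern.finditer(protein)]
    return hits


if __name__ == "__main__":
    # Hypothetical peptide built only to exercise both patterns.
    test_peptide = "MA" + "GAGISTSAGIPDFRS" + "KK" + "YTQNID"
    print(find_sir2_motifs(test_peptide))
```

Applied to a full-length sequence, the reported positions could be compared with the positions given in the text (84 to 98 and 165 to 170).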

The cDNA of SEQ ID NO: 173 encoding the protein of the invention differs from the one
encoding the SIRT2 protein by a supplementary exon between positions 147 and 195. This exon
modifies the initiation codon of the protein and extends the ORF in its N-terminal part by 16 amino
acids. Moreover, the amino acid residues at positions 20 and 21 of the protein of the invention (respectively
an alanine and a glutamine residue) replace the glutamine and tyrosine residues found at
positions 4 and 5 of the SIRT2 protein. Thus, the protein of the invention is a new isoform of
SIR2 resulting from alternative splicing. The protein of the invention of SEQ ID NO: 414 is also 37
amino acids longer than the SIR2L protein at its N-terminal end.

20 Regulation of gene expression by alterations in chromatin structure is a universal 

mechanism in eukaryotic cells, responsible for maintaining patterns of gene expression throughout 
the development of multicellular organisms. Silencing has been studied most extensively in S. 
cerevisiae (yeast). Among the SIR genes, SIR2 is the most evolutionarily conserved, and a number
of genes with homology to SIR2 have been identified (Frye R et al., Biochem. Biophys. Res.
Commun., 273:793-798 (2000)). The presence of homologues of SIR2 (HSTs) in organisms from
bacteria to humans suggests that SIR2's silencing mechanism might be conserved. SIR2 was
originally discovered to influence mating-type control in haploid cells by locus-specific
transcriptional silencing. It has also been suggested that SIR2 and its homologs play additional 
roles in suppression of recombination, chromosomal stability, metabolic regulation, meiosis, and
aging (for a review, see Gartenberg, Curr. Opin. Microbiol. 3:132-137 (2000)).

Proteins of the SIR2 family are also thought to be either enzymes or enzyme cofactors. 
First, Landry and collaborators have shown that members of SIR2 family catalyze histone 
deacetylation in a reaction that requires NAD, thereby distinguishing them from previously 
characterized deacetylases. This enzyme is active on histone substrates that have been acetylated by
both chromatin assembly-linked and transcription-related acetyltransferases (Landry et al., Proc.
Natl. Acad. Sci. 97:5807-5811 (2000)). Discovery of an intrinsic deacetylation activity for the
conserved SIR2 family provides a mechanism for modifying histones and other proteins to regulate
transcription and diverse biological processes. Secondly, a human SIR2 family member
(hSirT2) was found to have a mono-ADP ribosylation activity in vitro (Frye R et al., Biochem.
Biophys. Res. Commun., 260: 273-279 (1999)). Among potential substrates for mono-ADP
ribosylation are histones and RNA Pol I, modification of which correlates with enhanced rDNA
transcription.

It is believed that the protein of SEQ ID NO: 414 or part thereof plays a role in gene 
silencing, suppression of recombination, chromosomal stability, metabolic regulation, meiosis, and 
aging, probably as a member of the SIR2 protein family. Particularly, the protein of the invention 
may deacetylate substrates, preferably acetylated histones and acetyltransferases, either directly or
indirectly as an enzyme cofactor. Particularly, the protein of the invention may have a ribosylation
activity, preferably a mono-ADP ribosylation activity, preferably on histones and RNA Pol I
substrates, either directly or indirectly as an enzyme cofactor. Additionally, the protein of the
invention may be a DNA binding protein. Preferred polypeptides of the invention are polypeptides
comprising the amino acids of SEQ ID NO: 414 from positions 1 to 16, 84 to 98, 165 to 170, 195 to
224, and 84 to 268 as well as fragments of SEQ ID NO: 414 containing at least one of the cysteine
residues located in positions 195, 200, 221 or 224 of SEQ ID NO: 414. Other preferred polypeptides of the
invention are fragments of SEQ ID NO: 414 having any of the biological activities described herein.
The deacetylation activity of the protein of the invention or part thereof may be assayed using any
of the assays known to those skilled in the art including those described in Landry et al., supra.
The ribosylation activity of the protein of the invention or part thereof may be assayed using any of
the assays known to those skilled in the art including those described in Frye et al. (1999). The
nucleic acid binding activity of the protein of the invention or part thereof may be assayed using any
of the assays known to those skilled in the art including those described in US patent 6,013,453.

The invention relates to methods and compositions using the protein of the invention or part
thereof to silence gene expression. In a preferred embodiment, the protein of the invention or part
thereof or derivative thereof is added in an effective amount to an in vitro culture to inhibit gene 
expression and thus cell proliferation using molecular biology techniques known to those skilled in 
the art allowing the import of the protein from the extracellular medium to the cell's nucleus. In 
another embodiment, eukaryotic cells are genetically engineered in order to express the protein of
the invention or part thereof under specific conditions in order to prevent further proliferation of
such cells upon demand such as infection, transformation, end of a production process,
differentiation, etc.

The invention relates to methods and compositions using the protein of the invention or part 
thereof to deacetylate substrates, alone or in combination with other substances. Such substrates are 
acetylated substrates, preferably acetylated histones and acetyltransferases. For example, the protein
of the invention or part thereof is added to a sample containing the substrate(s) in conditions
allowing deacetylation, and allowed to catalyze the deacetylation of the substrate(s). In a preferred
embodiment, the deacetylation is carried out using a standard assay such as those described in
Landry et al., supra. Deacetylated histones obtained by this method may be mixed with purified
naked DNA (plasmid preparations for example) in order to reconstitute chromatin-like structures
in vitro. Such structures are of great interest in the study of enzymatic factors involved in
transcription and replication.

Another embodiment of the present invention relates to compositions and methods of using
the protein of the invention or part thereof to develop assays for in vitro screening of inhibitors
directed against the encoded deacetylase activity using any technique known to those skilled in the
art including those described herein. Such deacetylase inhibitors are of great potential as new drugs
due to their ability to influence transcriptional regulation and to induce apoptosis or differentiation
in cancer cells. Preferably, the protein of the invention, expressed in prokaryotic or eukaryotic 
systems according to methods known to those skilled in the art, may be mixed in vitro with a simple 
fluorescent substrate like an aminocoumarin derivative of an acetylated lysine, and different 
putative inhibitors. The coumarin derivative is then quantitated using a reverse-phase HPLC system
with a fluorescence detector. Such an approach has been previously developed by Hoffmann and
collaborators (Hoffmann et al., Nucl. Acids Res. 27:2057-2058 (1999); Hoffmann et al., Pharmazie 
55:601-606 (2000)). 

The invention relates to methods and compositions using the protein of the invention or part 
thereof to bind to nucleic acids, preferably DNA, alone or in combination with other substances.
For example, the protein of the invention or part thereof is added to a sample containing nucleic
acid in conditions allowing binding, and allowed to bind to nucleic acids. In a preferred 
embodiment, the protein of the invention or part thereof may be used to purify nucleic acids such as 
restriction fragments. In another preferred embodiment, the protein of the invention or part thereof 
may be used to visualize nucleic acids when the polypeptide is linked to an appropriate fusion
partner, or is detected by probing with an antibody. Alternatively, the protein of the invention or
part thereof may be bound to a chromatographic support, either alone or in combination with other
DNA binding proteins, using techniques well known in the art, to form an affinity chromatography
column. A sample containing nucleic acids to be purified is run through the column. Immobilizing the
protein of the invention or part thereof on a support is particularly advantageous for those
embodiments in which the method is to be practiced on a commercial scale. This immobilization
facilitates the removal of the protein from the batch of product and subsequent reuse of the protein. 
Immobilization of the protein of the invention or part thereof can be accomplished, for example, by 
inserting a cellulose-binding domain in the protein. One of skill in the art will understand that other 
methods of immobilization could also be used and are described in the available literature. 

Still another embodiment of the present invention relates to compositions and methods of
using the protein of the invention or part thereof to identify genes or regions of the human genome
silenced by the protein of the invention or part thereof. Genomic DNA derived from patients with
pathologies such as cancer and metabolic disorders, or from elderly people, may be compared to
that extracted from respective controls for its ability to bind the protein of the invention. As
described previously, the protein of SEQ ID NO: 414 displays a putative zinc finger domain
capable of binding DNA sequences near the regions or genes to be silenced. The protein of the invention or
part thereof may be bound to a chromatographic support, using techniques well known in the art, to
form an affinity chromatography column. A sample containing a mixture of human genomic DNA
digested by restriction endonucleases is run through the column. After extensive washing, the bound
DNA is eluted and further subcloned in classical cloning vectors known to those skilled in the art.
Immobilizing the protein of the invention or part thereof on a support is particularly advantageous
for those embodiments in which the method has to be practiced routinely. This immobilization
facilitates the removal of DNAs from the batch of resin-coupled protein after binding, and allows
subsequent re-use of the protein. Immobilization of the protein of the invention or part thereof can
be accomplished, for example, by inserting any matrix-binding domain in the protein according to
methods known to those skilled in the art. The resulting fusion product including the protein of the
invention or part thereof is then bound, covalently or by any other means, to a protein, carbohydrate
or matrix (such as gold, "Sephadex" particles, and polymeric surfaces).

Another embodiment of the invention relates to methods of preparing antibodies directed
against the protein of the invention or part thereof. Such antibodies may be used in
co-immunoprecipitation procedures that enrich for chromatin fragments containing binding sites for the
protein of the invention. This method may identify genes or regions of the human genome silenced
by the deacetylase activity of the protein of the invention. Preferably, antibodies directed against
the protein of SEQ ID NO: 414 and coupled to protein A or protein G sepharose beads are added to
samples containing fragments of native chromatin. Immunoprecipitation conditions are those known to
those skilled in the art. After washing, DNA fragments co-precipitated with the protein of SEQ ID
NO: 414 are extracted and further subcloned in routinely used cloning vectors. These DNA fragments
are sequenced and/or used as probes to screen genomic libraries. This procedure is very similar to the one used by
Gould and collaborators to enrich for embryonic chromatin fragments containing sites for the
homeotic Ubx protein (Gould et al., Nature 348:308-312 (1990)).

A preferred embodiment of the invention relates to compositions or methods using SEQ ID
NO: 414, SEQ ID NO: 173 or part thereof to diagnose, treat and/or prevent disorders
caused by the expression of "disease causing" genes. The number of pathologies and conditions
that could be treated by the protein of the invention is potentially very large. Preferred
disorders are those linked to dysregulation of gene transcription, such as cancer and other disorders relating to
abnormal cellular differentiation, proliferation, or degeneration, including hyperaldosteronism,
hypocortisolism (Addison's disease), hyperthyroidism (Graves' disease), hypothyroidism, colorectal
polyps, gastritis, gastric and duodenal ulcers, ulcerative colitis, and Crohn's disease; viral infections,
especially HIV and viral hepatitis (i.e. expression of viral proteins); metabolic diseases such as
obesity; and a number of inflammatory diseases due to interleukin over-expression. For diagnostic
purposes, the expression of the protein of the invention could be investigated using any of the
Northern blotting, RT-PCR or immunoblotting methods described herein and compared to the
expression in control individuals. For prevention and/or treatment purposes, the protein of the
invention may be overexpressed using any of the gene therapy methods known to those skilled in
the art including those described herein. For example, switching off "disease" genes may be
achieved by directly targeting the protein of the invention or part thereof to the genes
(such as oncogenes in cancers) that are over-expressed in order to silence their expression. This
could be achieved by making a "chimera" protein in which the putative zinc-binding domain is
replaced by a sequence known to bind to, or near, the over-expressed gene as explained elsewhere in
the application. Fusion proteins containing both the deacetylase activity and the specific DNA
binding domain are obtained by methods of molecular biology well known to those skilled in the
art. The corresponding eukaryotic expression vectors may be used in gene therapy in the cases of
cancer, metabolic disorders, aging and any disorder where a gene is over-expressed. Such
recombinant cDNA may be introduced in the well known adenoviral vectors used in cancer therapy
(for a recent review on the use of replicative adenoviruses for cancer therapy: Alemany et al., Nat.
Biotechnol. 18:723-727 (2000)).

Another related embodiment relates to the use of SEQ ID NO: 414, SEQ ID NO: 173, its
complement, or any part thereof to develop antagonists of the protein of the invention and of the
SIR complex. These antagonists could be antisense oligonucleotides, triple helices, ribozymes,
small molecules or antibodies and may be used to treat diseases and conditions caused by abnormal
gene silencing. These conditions include accelerated aging syndromes such as Cockayne's
syndrome, ataxia telangiectasia and Werner's syndrome, age-associated diseases, and
"early onset" forms of diseases associated with old age such as dementia and Parkinson's
disease.

In another embodiment, the invention relates to methods and compositions using the protein 
of the invention or part thereof as a marker protein to selectively identify tissues, preferably brain 
tissues. For example, the protein of the invention or part thereof may be used to synthesize specific
antibodies using any techniques known to those skilled in the art including those described herein.
Such tissue-specific antibodies may then be used to identify tissues of unknown origin, for example,
forensic samples, differentiated tumor tissue that has metastasized to foreign bodily sites, or to 
differentiate different tissue types in a tissue cross-section using immunochemistry. 

Protein of SEQ ID NO:298 (H82-1-2-0-D12-CS)

The protein of SEQ ID NO:298, encoded by the cDNA of SEQ ID NO:57, is homologous to 
proteins of the fibroblast growth factor family (FGF). Specifically the amino acid sequence of SEQ
ID NO:298 is identical to the recently described FGF-23. The protein of the invention is strongly
expressed in the fetal liver. 

The protein of the invention presents the pfam signature for fibroblast growth factors 
(positions 48 to 129). High resolution X-ray structures of crystals of both FGF-1 and FGF-2 have 
been reported and reveal a "beta trefoil" topology, comprising 12 strands linked to form a three-fold
symmetrical structure made up of four-stranded antiparallel beta sheets (see Faham S. et al. - Curr
Opin Struct Biol. - 1998, 8(5): p578-586). On the basis of sequence conservation, it seems very
likely that all members of the FGF family have related three-dimensional structures. Preferred
polypeptides of the invention are those that comprise amino acids 39 to 45; 51 to 56; 60 to 64; 71 to
77; 82 to 87; 92 to 97; 101 to 105; 113 to 119; 124 to 130; 142 to 147; 151 to 155 and/or 167 to
172, which by homology with other members of the FGF family make up the 12 beta pleated sheets
characteristic of the FGF family (White K. et al. - Nat Genet. - 2000; 26(3): p345-348).
Furthermore, as within these regions a number of amino acids from SEQ ID NO:298 are conserved
in over 80% of human FGFs (after sequence alignment), the most preferred polypeptides of the
invention comprise amino acids 42, 53, 63, 83, 85, 87, 93, 96, 101, 113, 115, 124, 127 and/or 129.
Other preferred polypeptides of the invention are any fragment of SEQ ID NO:298 having any of 
the biological activities described herein. 
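
As a worked check of the coordinates given above, the short sketch below verifies that each of the listed conserved residues falls inside one of the twelve listed residue ranges and shows how a fragment can be cut from a sequence using the same 1-based, inclusive numbering. The helper names are hypothetical and chosen for illustration; no actual sequence from the sequence listing is reproduced.

```python
# 1-based inclusive residue ranges listed in the text for the 12 predicted strands.
STRANDS = [(39, 45), (51, 56), (60, 64), (71, 77), (82, 87), (92, 97),
           (101, 105), (113, 119), (124, 130), (142, 147), (151, 155), (167, 172)]

# Residues the text describes as conserved in over 80% of human FGFs.
CONSERVED = [42, 53, 63, 83, 85, 87, 93, 96, 101, 113, 115, 124, 127, 129]


def strand_of(position, strands=STRANDS):
    """Return the 1-based index of the range containing `position`, or None."""
    for index, (start, end) in enumerate(strands, start=1):
        if start <= position <= end:
            return index
    return None


def fragment(sequence, start, end):
    """Slice a protein string using 1-based inclusive coordinates."""
    return sequence[start - 1:end]


# Map every conserved residue to the range that contains it.
assignment = {position: strand_of(position) for position in CONSERVED}
print(assignment)
```

Every conserved position maps to one of the listed ranges, which is consistent with the statement that these residues lie within the regions described.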

Cytokines are a heterogeneous group of polypeptide mediators associated with numerous 
functions, including immune system and inflammatory responses. The cytokine families include,
but are not limited to, interleukins, chemokines, tumor necrosis factors, interferons, colony
stimulating factors, neurotrophic factors, neuropoietins and growth factors (of which the FGF family is a
member). Fibroblast growth factors (FGFs) were first characterized, in the mid 1970s, as mitogens
of cultured fibroblasts. Since then more than 20 different FGFs have been identified. Fibroblast
growth factors belong to a family of proteins called growth factors (other members of this family
include EGF, PDGF, TGFs and ECGF). The biological effects of FGFs are mediated by association
with 3 biochemically distinct partners: heparan sulfate proteoglycans, a low affinity transmembrane 
FGF-binding protein and high-affinity transmembrane FGF receptors of the tyrosine-kinase class. 
Transfection and reconstruction experiments have shown that intracellular signal transduction is 
triggered by activation of FGF receptor kinase activity. Activation is brought about by receptor
oligomerization, which is mediated by the association of heparan sulfate proteoglycans with the
ligand (FGF) and of the ligand with the receptor itself (Faham S. et al. - Curr Opin Struct Biol. - 
1998, 8(5): p578-586). Longer heparin-derived oligosaccharides generally exhibit tighter binding to 
FGF. The relationship between heparin length, biological activity and FGF binding has been 
extensively studied and there is general agreement that longer heparin oligosaccharides tend to be
more biologically active. FGFs are members of a family with a broad range of biological activities
involving cell growth and differentiation (including angiogenesis, morphogenesis and wound 
healing); cell survival, replication, adhesion and mobility. FGFs have been found to be potent
growth factors for a number of cell types including, but not limited to fibroblasts, endothelial cells,
smooth muscle cells, keratinocytes, osteoblasts and neurons. 

Clearly, FGF biology is potentially very complex, involving multiple ligands, receptors and 
cofactors, each expressed with different spatial and temporal patterns and distinct kinetics in the 
course of normal development. Considerable efforts have been expended on the creation of
different types of animal models for the analysis of FGF function in vivo. These studies clearly 
indicate that FGF signaling is involved in a number of different processes at different stages of 
development and is critical in early developmental stages (FGF-4 and FGF receptor 1 homozygous 
null mutations both cause early lethality). FGF signaling has been found to be required for both
branching morphogenesis of the lung and the establishment of the normal program of keratinocyte
differentiation in the skin. FGF signaling has also been found to be involved in both the initial 
induction and sustained outgrowth of the limb bud during early limb development. Perhaps the 
most impressive illustration of this function of FGF signaling is the ability to induce supernumerary 
limb development in the chick by local application of FGF-soaked beads (Cohn M, et al. - Cell -
1995; 80: p739-746), thus indicating that at least some FGF-dependent processes are regulated by
accessibility of an FGF ligand. FGFs are also capable of stimulating migration and differentiation 
of hepatic precursors. 

Recently mutations in the FGF-23 gene were found to be associated with autosomal 
dominant hypophosphataemic rickets (ADHR), a genetically transmitted disease characterized by
low serum phosphorus concentrations, rickets, osteomalacia, lower extremities deformation, short
stature, bone pain and dental abscesses. It seems very likely that FGF signaling functions are 
involved in numerous aspects of morphogenesis, differentiation and other essential cellular 
mechanisms, and are thus likely involved in any of a large number of diseases and conditions 
associated with these processes. 

Thus, it is believed that the protein of SEQ ID NO:298 is a member of the fibroblast growth
factor family, and is thus involved in a large number of cellular and organismal processes, including,
but not limited to, cell growth and differentiation, angiogenesis, morphogenesis, wound healing, cell 
survival, replication, adhesion and mobility. 

One embodiment of the present invention relates to the use of the present polypeptides and
polynucleotides to identify liver, heart, thyroid and parathyroid tissues, or cells derived from these
tissues, since the protein of the invention is expressed therein (White K. et al. - Nat Genet. - 2000; 
26(3): p345-348). Such detection of cells expressing the protein can be carried out in any of a 
number of ways, including the use of specific antibodies or antiserum generated against the protein 
using standard methods, as well as using polynucleotide probes specific for nucleic acids encoding
the protein of the invention.

In another embodiment, the protein of the invention or part thereof can be used as a mitogen 
to stimulate the growth of a number of different cell types including, but not limited to, fibroblasts,
muscle cells, osteoblasts, keratinocytes and hepatocytes. The growth of cells can be stimulated in
vitro, for example to promote the growth of cells cultured for the synthesis of recombinant proteins,
or for ex vivo gene therapy applications. Another preferred application of this technique relates to
the use of the protein of the invention or part thereof to generate in vitro tissues and organs
including, but not limited to, skin, cartilage, and bone for transplants and grafts (Lancet - 1981,
1(8211): 75-8).

Another preferred embodiment of the invention relates to the use of the protein of the invention or part
thereof to treat damaged tissues and organs. Members of the FGF family have been shown to
induce the differentiation and growth of a number of different cell types. Thus the protein of the
invention can be administered to treat pathologies and conditions that result from damage to cells,
tissues or organs. These pathologies and conditions include but are not limited to bone fractures
and bone defects (Kimoto et al. - J Dent Res - 1998, 77(12): p1965-1969) (Solheim E - Int Orthop -
1998, 22(6)), damage due to wounds (such as lesions of the skin and ulcers) (Debus E. - Zentralbl
Chir - 2000, 125 (Suppl 1): p49-55) (Szabo S. - Aliment Pharmacol Ther - 2000, 14 (Suppl 1):
p33-43), tissue damage due to ischemia (for example, in the brain and heart) (Simons M -
Circulation - 2000, 102(11): pE73-E86), cardiovascular diseases such as thrombosis and
atherosclerosis (Bauters C. - Drugs - 1999, 58 (Spec No 1): p11-15) and neurodegenerative diseases
due to neuronal loss such as Parkinson's disease or Alzheimer's disease (Ebadi M - Neurochem Int
- 1997, 30(4-5): p347-374) (Brundin P. - Cell Transplant - 2000, 9(2): p179-195).

In a most preferred embodiment, the polypeptides or polynucleotides of the invention can
be used to diagnose, treat, or prevent disorders resulting from non-functional and/or mutated FGFs,
such as autosomal dominant hypophosphataemic rickets, which is associated with mutation of
certain amino acids of FGF-23 (White K. et al. - Nat Genet. - 2000; 26(3): p345-348, which is
hereby incorporated by reference in its entirety). Such disorders can be treated, for example, by
administering a therapeutically effective amount of the protein of the invention or a polynucleotide
sequence encoding the protein to a patient suffering from the disorder. Similarly, SEQ ID NO:298
or SEQ ID NO:57 or any part thereof can be used to develop diagnostic kits in order to diagnose,
prevent and/or treat any other disease associated with FGF, for example pathologies associated with 
FGF overexpression. 

In yet another embodiment, the protein of the invention or part thereof can be used to
develop antagonists of FGF and/or FGF receptors in order to treat disorders associated with an
over-activation of FGF pathways (for example, due to over production of FGF or overstimulation of 
FGF receptors). This is particularly true for pathologies such as cancers where some tumors secrete 
large quantities of FGF, such as prostate and breast cancers. Furthermore, FGF antagonists can be
useful in inhibiting tumor angiogenesis, which is an essential step in tumor growth. In the same
way SEQ ID NO:57 or any part thereof could be used to generate antisense oligonucleotides. 
Antisense oligonucleotides block complementary mRNA, thus inhibiting the synthesis of the
protein encoded by the mRNA. These oligonucleotides can be used in in vivo or ex vivo treatment
of the diseases caused or aggravated by overexpression of FGFs. 
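
The base-pairing relationship that underlies antisense design can be sketched as follows. This is a minimal illustration only: an antisense oligonucleotide is taken here to be the reverse complement of a window of the sense cDNA, the window length and step are arbitrary assumptions, the function names are hypothetical, and the input is a toy sequence rather than SEQ ID NO:57. Practical oligonucleotide selection involves further criteria (target accessibility, specificity, chemistry) that are not modelled.

```python
# Watson-Crick complement table for DNA written 5' to 3'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (upper-case A, C, G, T)."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_candidates(cdna, length=20, step=5):
    """Return [(start, oligo), ...] where start is the 1-based position of the
    targeted window on the sense strand and oligo is its reverse complement."""
    candidates = []
    for i in range(0, len(cdna) - length + 1, step):
        candidates.append((i + 1, reverse_complement(cdna[i:i + length])))
    return candidates


# Hypothetical toy sense-strand sequence used only to exercise the functions.
toy_cdna = "ATGGCCAAGTTTGACCTGAA"
for start, oligo in antisense_candidates(toy_cdna, length=10, step=5):
    print(start, oligo)
```

Each returned oligonucleotide is written 5' to 3' and pairs with the indicated window of the sense strand, and therefore with the corresponding stretch of the mRNA.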



Protein of SEQ ID NO: 396 (internal designation: 160-12-1-0-D10-CS)

The protein of SEQ ID NO: 396, encoded by the cDNA of SEQ ID NO: 155 and overexpressed
in brain and fetal brain, shows homology to members of the transmembrane 4 superfamily of
proteins (TM4SF). The protein of the invention displays signatures characteristic of this family,
namely the pfam domain from positions 66 to 273, the Prosite motif from positions 112 to 134 as
well as the emotif domains from positions 108 to 127, 108 to 146, 129 to 150, 128 to 154, and 247
to 274. In addition, the protein of the invention has several predicted transmembrane domains: 103
to 123, 130 to 150 and 245 to 265, with an additional predicted domain with lower certainty from
positions 61 to 81. The protein of the invention has significant homology to a TM4SF member, the
integral membrane CD81 antigen also known as TAPA1 (Target of Antiproliferative Antibody),
except for its N-terminus. The transmembrane domains of the protein of the invention match
those described for CD81. 

Members of the tetraspan family of proteins are associated with adhesion molecules and
translate adhesive events into a regulation of cellular behaviour. TAPA-1 is a widely expressed
protein found to influence adhesion, morphology, activation, proliferation and differentiation of B,
T and other cells. TAPA-1 has two long hydrophilic extracellular domains located between its
four transmembrane regions (TM1-4). The region between
TM2 and TM3 is highly conserved in all tetraspanins. The protein is highly hydrophobic and
contains a potential N-myristoylation site. TAPA1 functions by forming a complex on the cell
surface and the antigenic epitope of the human TAPA1 is contained within a subregion of the
second extracellular domain of the protein. Cell-surface expression of TAPA1 can be
down-modulated by binding of antibodies (Levy 1991, J Biol Chem Aug 5;266(22):14597-602).

Mice lacking CD81 (not expressing TAPA1) have impaired antibody responses to protein
antigens. This defect is specific to antigens that preferentially stimulate a T helper 2 response and is
only seen with T cell-dependent antigens. Absence of CD81 on B cells is sufficient to cause the
defect. Antigen-specific interleukin (IL) 4 production is greatly reduced in the spleen and lymph
nodes of CD81-null mice compared with heterozygous littermates. The expression of CD81 on B
cells is critical for inducing optimal IL-4 and antibody production during T helper 2 responses.
CD81 is likely to have a greater role in the control of immune responses than in the development of
immune cells (Maecker (1997) J Exp Med 1997 Apr 21;185(8):1505-10). CD81 on B cells has the
capacity to promote IL-4 secretion from T cells. Costimulatory proteins such as B7-1 and B7-2
have been shown in some systems to have differential effects on cytokine secretion by T cells.
CD81 on B cells can control cytokine production by T cells. TAPA-1 has been implicated
in the regulation of lymphoma cell growth.

TAPA-1 is highly expressed in many neurons of the brainstem. TAPA is found in all glial
cells, and the level of this protein correlates with their maturation (Sullivan et al., 1998, J Comp 
Neurol 1998 Jul 6;396(3):366-80). This protein is expressed by ependyma, choroid plexus, 
astrocytes, and oligodendrocytes. TAPA1 is dramatically upregulated during early postnatal 
development, at the time of glial birth and maturation. At embryonic day 18, the levels of TAPA
are low, with most of the immunoreaction product being associated with the ependyma, choroid 
plexus, and the glia limitans. The amount of TAPA expressed in the brain increases with brain 
development, and at postnatal day 14 the protein levels approach those of the adult. This increase in 
the levels of TAPA at postnatal day 14 is due to upregulation in the gray matter and white matter.
TAPA has been associated with reactive gliosis and the glial scar. The spatiotemporal expression
pattern of CD81 by reactive microglia and astrocytes indicates that CD81 is involved in the glial 
response to spinal cord injury. It is suggested that the upregulation of TAPA is an integral 
component of glial scar formation (Peduzzi et al, Exp Neurol. 1999 Dec;160(2):460-8). 

The levels of TAPA-1 are low in metastatic prostate tumors, and expression of this protein in
these cells appears to suppress metastatic behavior (Dong et al., 1995 Science. 12;268(5212):884-6).
Bivalent antibodies directed against these proteins can be used to enhance adhesion of different
cell types: pre-B cells (Masellis-Smith and Shaw (1994) J Immunol 1994 Mar 15;152(6):2768-77) and
endothelial cells (Forsyth, 1991 Immunology Feb;72(2):292-6), and to affect tumor cell mobility and
invasiveness (Miyake et al., 1991 J Exp Med. Dec 1;174(6):1347-54). In the nervous system, the
migratory behavior of Schwann cells over biologically relevant substrates can be enhanced with the
application of antibodies directed against certain TM (Anton et al (1995) J Neurosci. Jan;15(1 Pt
2):584-95). Antibodies directed against TAPA-1 depress the mitotic activity and induce an increase
in cellular adhesion (Oren et al, 1990 Mol Cell Biol Aug;10(8):4007-15).

It is believed that the protein of SEQ ID NO: 396 or part thereof plays a role in cell 
adhesion, motility, metastasis, cell activation, signal transduction and the immune response, 
probably as a member of the TM4SF family. As a member of the tetraspanin family of proteins, the 
protein of SEQ ID NO: 396 or part thereof is believed to mediate cellular interaction in lymphoid 
cells as well as non-hematolymphoid tissue to affect cell adhesion and migration, and to alter cell 
morphology and the activation state of a cell. Preferred polypeptides of the invention are 
polypeptides comprising the amino acids of SEQ ID NO: 396 from positions 66 to 273, 112 to 134, 
108 to 127, 108 to 146, 129 to 150, 128 to 154, and 247 to 274. Other preferred polypeptides of the 
invention are fragments of SEQ ID NO: 396 having any of the biological activities described herein. 
The activity of the protein of the invention or part thereof may be assayed using any of the assays 
known to those skilled in the art including those describing a functional tissue assay used to define 
surface antigens regulating astrocyte growth (Eldon et al, 1996, J Neurosci, 16(17):5478); cellular 
function assays determining the involvement of the protein in signal transduction and cell adhesion 
in the immune system (Levy et al, 1998 Ann. Rev. Immunol. 16:89-109, Virtaneva et al, 1994 
Immunogenetics 39: 329-334). 
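
The position ranges recited above are read here as 1-based and inclusive, which is the usual 
convention for residue numbering in sequence listings. The following is a minimal sketch, under 
that assumption, of how such fragments could be cut from a sequence held as a string. The helper 
name `fragment` and the `PLACEHOLDER` sequence are hypothetical and introduced only for 
illustration; the placeholder is not the sequence of SEQ ID NO: 396. 

```python
# Illustrative only: cut fragments from a protein sequence using
# 1-based, inclusive residue positions.

def fragment(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq (1-based, inclusive)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} outside sequence of length {len(seq)}")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return seq[start - 1:end]

# Hypothetical 280-residue placeholder, NOT the real sequence.
PLACEHOLDER = "ACDEFGHIKLMNPQRSTVWY" * 14

# The ranges recited in the paragraph above.
RANGES = [(66, 273), (112, 134), (108, 127), (108, 146),
          (129, 150), (128, 154), (247, 274)]

fragments = {r: fragment(PLACEHOLDER, *r) for r in RANGES}

for (start, end), frag in fragments.items():
    print(f"{start}-{end}: {len(frag)} residues")
```

A fragment spanning positions `start` to `end` always has `end - start + 1` residues, which gives a 
quick consistency check when the ranges are applied to a real sequence. 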

An embodiment of the present invention relates to methods of using the protein of the 
invention or part thereof to identify and/or quantify membrane proteins, preferably integrins, 
lineage specific molecules, tetraspanins, and antibodies, in a biological sample, and thus used in 
assays and diagnostic kits for the quantification of such membrane proteins in tissue samples and in 
mammalian cell cultures. The binding activity of the protein of the invention or part thereof may be 
assessed using the assay described in Shoshana et al, 1998, Annu. Rev. Immunol 16: 89-109; 
Maecker et al, 1998 PNAS 95: 2458-2462; Geisert et al, 1996, J of Neuroscience 16(17): 5478-5487 
or any other method familiar to those skilled in the art. Preferably, a defined quantity of the protein 
of the invention or part thereof is added to the sample under conditions allowing the formation of a 
complex between the protein of the invention or part thereof and the membrane protein to be 
identified and/or quantified. Then, the presence of the complex and/or the free protein of the 
invention or part thereof is assayed and eventually compared to a control using any of the 
techniques known by those skilled in the art. 

In another embodiment, the invention relates to compositions and methods using the protein 
of the invention or part thereof to stimulate cell proliferation, preferably proliferation of lymphoid 
cells both in vitro and in vivo. For example, soluble forms of the protein of the invention or part 
thereof may be added to cell culture medium in an amount effective to stimulate cell proliferation. 

In another embodiment, the invention relates to compositions and methods using the protein 
of the invention or part thereof or derivative thereof to stimulate antibody production either in vitro 
or in vivo. In a preferred embodiment, the protein of the invention or part thereof or derivative 
thereof may be added in an effective amount to stimulate antibody production to an in vitro culture 
of antibody-producing cells, such as hybridomas. In another preferred embodiment, the protein of 
the invention or part thereof or derivative thereof may be injected into an animal in order to increase 
the animal's antibody production to a protein of interest in the case of production of polyclonal 
antibodies. 

In still another embodiment, the invention relates to compositions and methods using the 
protein of the invention or part thereof or derivative thereof to decrease cell adhesion either in vitro 
or in vivo. In a preferred embodiment, the protein of the invention or part thereof or derivative 
thereof may be added in an effective amount to prevent and/or inhibit cell adhesion to an in vitro 
culture of adherent cells in order to recover those adherent cells. 

In still another embodiment, the invention relates to compositions and methods using the 
protein of the invention or part thereof to treat and/or prevent cell-proliferative disorders, such as 
cancers, preferably brain cancer, via the prevention of metastasis, and disorders characterized by 
depressed immune response such as autoimmune diseases, AIDS, allergy, type 1 diabetes, systemic 
lupus erythematosus, chronic rheumatoid arthritis, juvenile rheumatoid arthritis, Sjogren's 
syndrome, systemic scleriasis, mixed connective tissue disease and dermatomyositis, Hashimoto's 
disease, primary myxedema, thyrotoxia, pernicious anemia, ulcerative colitis, autoimmune atrophic 
gastritis, idiopathic Addison's disease, male infertility, Goodpasture's syndrome, acute progressive 
glomerular nephritis, myasthenia gravis, multiple myositis, pemphigus vulgaris, bullous 
pemphigoid, sympathetic ophthalmia, multiple sclerosis, autoimmune hemolytic anemia, idiopathic 
thrombocytopenic purpura, postmyocardial infarction syndrome, rheumatic fever, lupoid hepatitis, 
primary biliary cirrhosis, Behcet's syndrome and Crest's syndrome, via the stimulation of antibody 
production and IL-4 secretion. 

In another embodiment, the invention relates to methods and compositions using the protein 
of the invention or part thereof as a marker protein to selectively identify tissues, preferably from 
brain and fetal brain origin. For example, the protein of the invention or part thereof may be used to 
synthesize specific antibodies using any techniques known to those skilled in the art including those 
described herein. Such tissue-specific antibodies may then be used to identify tissues of unknown 
origin, for example, forensic samples, differentiated tumor tissue that has metastasized to foreign 
bodily sites, or to differentiate different tissue types in a tissue cross-section using 
immunochemistry or any other technique known to those skilled in the art. 

Protein of SEQ ID NO: 296 (internal designation 181-3-3-0-B8-CS) 

The protein of SEQ ID NO: 296 encoded by the cDNA of SEQ ID NO: 55, overexpressed in 
fetal liver, is homologous to the whole domain IV-4 and IV-5 of the basement membrane-specific 
heparan sulfate proteoglycan core protein (perlecan), well conserved among C. elegans, mice and 
human (accession numbers Q06561, Q05793 and P98160 respectively). The 247-amino-acid-long 
protein of the invention displays two putative hydrophobic stretches from positions 44 to 64 and 
219 to 239 and a putative immunoglobulin domain from positions 141 to 197, homologous to the Ig 
domain 4 of the Ig repeat structure of domain IV of perlecan proteins. The protein of the invention 
also displays a putative secreted signal peptide from positions 6 to 21. 

Basement membranes are specialized regions of extracellular matrix (ECM) containing a 
large number of different components, including laminin, collagen, nidogen and heparan sulfate 
proteoglycans (for a review see Bernfield et al., Annu. Rev. Biochem. 68:729-777 (1999)). 
Perlecan, a major basement membrane component, plays important roles in many fundamental 
developmental and regenerative processes, including cell cohesion, adhesion and migration, signal 
transduction, and even gene regulation (Martin and Timpl, Annu. Rev. Cell Biol. 3:57-85 (1987)). 
The cDNA sequence of perlecan encodes a large core protein consisting of five structural domains, 
referred to as domains I to V, with distinct motifs such as SEA modules (domain I), LDL class A 
modules (domain II), cysteine-rich LE modules (domain III), LAMB modules (domain V). Domain IV 
consists of Ig-like repeats (14 in mice, 21 in human perlecan) similar to those of neural cell 
adhesion molecules (N-CAMs). Glycosaminoglycan chains are mostly linked to domain I and have been 
shown to participate with the core protein in its differential expression in tissues and 
developmental stages (Perrimon and Bernfield, Nature 404:725-728 (2000)). 

The N-terminal fragment IV containing Ig modules from 1 to 8 shows high-affinity binding 
to the two known nidogen isoforms, laminin1-nidogen1 complex (LN) and binding to heparin at 
physiological ionic strength (Hopf et al., Eur. J. Biochem. 259:917-925 (1999)). An alteration study 
of the C. elegans perlecan shows in vivo that mutations in Ig-like modules 3 and 4 induce a lethal 
phenotype by inhibiting the spatial distribution of the splice variants. Mutations inducing deletions 
in other Ig-like modules of perlecan domain IV affect neither the isoform expression nor the 
spatial distribution, suggesting a crucial role of Ig-like modules 3 and 4 in muscle assembly and 
developmental stages (Mullen et al., Mol. Biol. Cell 10:3205-3221 (1999)). 

Several studies have shown the large presence of distinct perlecan isoforms through 
regulated alternative splicing in C. elegans (Rogalski et al, Genes Dev. 7:1471-1484 (1993), 
Rogalski et al, Genetics 139:159-169 (1995)). Although splice variants have not yet been shown in 
human, Ig-like modules are encoded by multiple exons compatible with different combinatorial 
possibilities of expression (Cohen et al., P.N.A.S. 90:10404-10408 (1993)). Alternative splicing 
within Domain IV is associated with temporal and spatial differences in isoform expression. A 
subset of C. elegans isoforms are associated with body-wall muscles during embryogenesis and are 
required for nematode myofilament lattice assembly, which is very similar to assembly of focal 
adhesions in mammalian cell culture (Moerman and Fire, CSH labo. Press (1997)). 

A basement membrane-like structure containing perlecan, collagen IV and laminin also plays a 
major role during liver differentiation by interacting with immature hepatocytes (Am. J. 
Path. 142:199-208 (1993)). 

Perlecan has been implicated in a number of processes and diseases resulting from the 
alteration of its structure including glomerular filtration deficiencies such as proteinuria, diabetic 
glomerulopathies, nephrotic syndromes, Denys-Drash syndromes (Groffen et al., Nephrol. Dial. 
Transplant 14:2119-2129 (1999)), mitogenesis and angiogenesis diseases (Aviezer et al., Cell 
79:1005-1013 (1994)), inflammation and tissue repair, ocular and skeletal defect syndromes, and 
microbial pathogenesis through invasion. Perlecan core protein has binding epitopes for the 
basement membrane proteins nidogen-1, nidogen-2, and fibulin-2, as well as for Alzheimer's 
beta-amyloid protein (Snow et al., Arch. Biochem. Biophys. 320:84-95 (1995)) and platelet growth 
factor. 

It is believed that the protein of SEQ ID NO: 296 or part thereof is a basement membrane-like 
protein, preferably a human isoform of the perlecan protein. Thus, the protein of the invention 
plays an important role in membrane integrity and interactions with other basement proteins and 
particularly with nidogen-1 and 2, LN complexes and heparin compounds. Besides, the protein of 
the invention is thought to participate in the interactions with cellular receptors such as integrins, 
with cytokine release and proteolysis, with regulation of angiogenesis, wound healing and tumor 
invasion. Being overexpressed in the fetal liver, the protein of the invention is thought to participate 
in the differentiation, migration and adhesion of hepatocytes through its spatial and temporal 
expression during embryogenesis. Preferred polypeptides of the invention are polypeptides 
comprising the amino acids of SEQ ID NO: 296 from positions 141 to 197. Other preferred 
polypeptides of the invention are fragments of SEQ ID NO: 296 having any of the biological 
activities described herein. The activity of the protein of the invention or part thereof may be assayed 
using any of the assays known to those skilled in the art including those described in Hopf et al., 
Eur. J. Biochem. 259:917-925 (1999) for binding assays with other membrane proteins and in 
Rescan et al., Am. J. Path. 142:199-208 (1993) for protein assays (Immunochemistry and ELISA). 

In one embodiment, the invention relates to methods and compositions using the protein of 
the invention or part thereof as a new marker protein to selectively identify embryogenic stages, 
preferably in liver tissues. For example, the protein of the invention or part thereof may be detected 
using specific antibodies able to bind to the protein using any technique known to those skilled in 
the art. Such tissue-specific antibodies may then be used to identify embryogenic cells with 
dysregulated membrane components such as in differentiated tumor cells or to differentiate different 
cell types in a tissue cross-section using immunochemistry. For example, the amount of the protein 
of the invention in embryogenic cells reflecting the characterized overexpression activity is 
measured and compared to that of a normal cell using a specific antibody detected by fluorescence 
(FACS, confocal microscopy, etc.) or any other detection method known to those skilled in the art. 

In another embodiment, the invention relates to methods and compositions using the protein 
of the invention or part thereof for the diagnosis of a disorder associated with overexpression of the 
protein of the invention, preferably but not limited to perlecan associated tumors such as human 
melanoma, proliferative diseases, glomerular filtration deficiencies such as proteinuria, diabetic 
glomerulopathies, nephrotic syndromes, Denys-Drash syndromes, mitogenesis and angiogenesis 
diseases, inflammation and tissue repair, ocular and skeletal defect syndromes, and microbial 
pathogenesis through invasion. The expression of the protein of the invention could be investigated 
using any methods well known to those skilled in the art, including Northern blotting, RT-PCR or 
immunoblotting using specific antibodies binding to the protein of the invention. 

In still another embodiment, the protein of the invention or part thereof could be used as a 
mitogen to stimulate the growth and differentiation of a number of different cell types including 
but not limited to fibroblasts, muscle cells, osteoblasts, keratinocytes and hepatocytes. In a 
preferred embodiment, the protein of the invention or part thereof is used in in vitro cultures such as 
those used for synthesis of recombinant proteins. The protein of the invention or part thereof is 
added to the culture in an amount effective to stimulate proliferation and/or differentiation. A more 
preferred application of this technique relates to the use of the protein of the invention or part 
thereof in generating or repairing in vitro tissues and organs such as but not limited to skin, cartilage, 
and bone for transplants and grafts (Lancet 1981, 1(8211):75-8, which disclosure is hereby 
incorporated by reference in its entirety). 

In another embodiment, an antagonist of the protein of SEQ ID NO: 296 may be 
administered to a subject to treat or prevent a cell proliferative disorder. Such disorders may 
include, but are not limited to, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed 
connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, 
polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, 
leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of 
the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, 
gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, 
prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus. In one aspect, an antibody 
which specifically binds to the protein of the invention may be used directly as an antagonist or 
indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissue 
which express the protein of the invention. In another example, antisense nucleotides, triple helices, 
Genetic Suppressor Elements (GSE), or ribozymes designed from nucleotides encoding the protein of 
the invention or part thereof using any methods known to those skilled in the art are administered to 
inhibit the expression of the protein of the invention. 

Protein of SEQ ID NO: 410 (internal designation 179-9-4-0-B8-CS) 

The protein of SEQ ID NO: 410 encoded by the cDNA of SEQ ID NO: 169 found in fetal 
kidney is homologous to proteins of the ankyrin family and to proteins containing a 
characteristic ankyrin repeated motif (pfam accession number: PF00023). The protein of the 
invention shows homology with human ankyrin proteins (PIR accession number A35049; SP-TREMBL 
accession number: Q99407), ankyrins from several different eucaryote species 
(Drosophila melanogaster: STR accession number Q9VAU5; mouse: STR accession number 
Q61302 and SwissProt accession number Q02357; cow: STR accession number AAF61702; 
Arabidopsis thaliana: STR accession number Q9ZQ79) and ankyrins from procaryote species 
(Paramecium bursaria Chlorella virus: STR accession number Q41164). 

In addition, the protein of the invention shows homology with other proteins containing 
the ankyrin repeat motif. The ankyrin repeat motif is a 33 amino acid motif and has an L-shaped 
structure consisting of two alpha helices following the beta hairpin loop (Gorina et al., Science 
274:1005 (1996)). Examples of proteins comprising ankyrin repeats include: channels, enzymes, toxins, 
transcription factors (Palek et al., Semin. Hematol. 27:290-332 (1990)), tankyrase (Smith et al., 
Science 282:1484-1487 (1998)), multiple proteins involved in signal transduction, in particular 
integrin-linked kinases (Huang et al., Int. J. Mol. Med. 3:563-572 (1999)), inhibitors of 
cyclin-dependent kinases (Baumgartner et al., Structure 6:1279-1290 (1998)), death-associated protein 
kinase involved in apoptosis (Raveh et al., Proc. Natl. Acad. Sci. USA. 15:1572-1577 (2000)) and many 
others. 

The ankyrin motif is also found in the protein of the invention (positions 47 to 79). 

Ankyrins are peripheral membrane proteins which have been found in erythrocyte, kidney 
and neuronal cells of mammals. Cells contain a cytoskeleton that links intracellular compartments 
with each other and the plasma membrane. Associations between the cytoskeleton and the lipid 
membranes bounding these compartments involve spectrin, ankyrin, and integral membrane 
proteins. Spectrin is a major component of the cytoskeleton and acts as a scaffolding protein. 
Similarly, ankyrin acts to tether the actin-spectrin moiety to membranes and to regulate the 
interaction between the cytoskeleton and membranous compartments. Different ankyrin isoforms 
are specific to different organelles and provide specificity for this interaction. Genes coding for 
three different mammalian ankyrins (ankyrin R, ankyrin B and ankyrin G) have been cloned. Ankyrin R 
was originally identified as part of the erythrocyte membrane skeleton, and was recently also 
localized to the plasma membrane of a subpopulation of post mitotic neurons in rat brain (Lambert, 
et al., 1993, J. Neurosci., 13, 3725-3735). Ankyrin B is a developmentally regulated human brain 
protein which has two alternatively spliced isoforms with molecular masses of 220 kilodaltons (kD) 
and 440 kD (Kunimoto, et al., 1991, J. Cell Biology, 115, 1319-1331). Ankyrin G is a more recently 
isolated human gene that encodes two neural-specific ankyrin variants (480 kD and 270 kD), which 
have been localized to the axonal initial segment and node of Ranvier (Kordeli, et al., 1995, J. Biol. 
Chem., 270, 2352-2359). Studies on mammalian ankyrins indicate that ankyrins bind a variety of 
proteins which have functions involved with the anion exchanger (Drenckhahn, et al., 1988, 
Science, 230, 1287-1289), Na+/K+-ATPase, amiloride-sensitive sodium channel in kidney (Smith, 
et al., 1991, Proc. Natl. Acad. Sci. U.S.A., 88, 6971-6975), voltage dependent sodium channel of 
the brain and the neuromuscular junction (Srinivasan, et al., 1988, Nature, 333, 177-180), and 
nervous system cell adhesion molecules (Davis, et al., 1994, J. Biol. Chem., 269, 27163-27166). 

Analyses of mammalian ankyrins have revealed that these large proteins are divided into 
three functional domains. These include an N-terminal membrane-binding domain of about 89-95 
kD, a spectrin-binding domain of about 62 kD, and a C-terminal regulatory domain of about 50-55 
kD. The membrane-binding domain is primarily comprised of tandem repeats of about 33 amino 
acids each. This domain usually has about 22-24 copies of these repeats. The repeat units appear to 
function in binding to membrane proteins such as anion exchangers, sodium channels, and certain 
adhesion molecules. The spectrin-binding domain, as the name implies, functions in binding to the 
spectrin-based cytoskeleton of cells positioned inside the plasma membrane. Finally, the regulatory 
domain, which is the most variant domain among the different ankyrins that have been studied, 
appears to function as a repressor and/or an activator of the protein-binding activities of the other 
two domains. Some of the variability seen in this domain among different ankyrin species appears 
to be the result of alternative splicing of nascent transcripts. The regulatory domain can respond to 
cellular signals, allowing remodeling of the cytoskeleton during the cell cycle and differentiation 
(Lambert, S. and Bennett, V. (1993) Eur. J. Biochem. 211:1-6). Ankyrin may be a target for the action 
of parasites. Erythrocyte ankyrin is cleaved by parasite proteases of Plasmodium falciparum, 
destabilizing the erythrocyte membrane skeleton, which facilitates parasite release (Raphael et al., 
Mol Biochem Parasitol. 110(2):259-272 (2000)). Recently, novel ankyrin proteins have been isolated 
from Dirofilaria and Brugia which may be useful in protecting animals, including humans, from 
diseases caused by parasitic helminths (United States Patent No. 6,063,599). 

Ankyrin sequences have been identified in various libraries, at least 50% of which are 
associated with cancer and at least 23% of which are associated with the immune response. Of 
particular note is the expression of ANFP in reproductive, hematopoietic/immune, and 
gastrointestinal tissues. See United States Patent 5,989,863. 

It is believed that the protein of SEQ. ID. NO: 410 is a member of the family of human 
ankyrin proteins and as such plays a role in regulating the interaction between the cytoskeleton and 
membranous components. The identification of a new member of the ankyrin family and the 
polynucleotides encoding it addresses a need in the art by providing new compositions which are 
useful in the diagnosis, prevention, and treatment of autoimmune/inflammatory, cell proliferative, 
and vesicle trafficking disorders and in modulating the response to infectious diseases. 

Preferred polypeptides of the invention are polypeptides comprising the amino acids from 
positions 47 to 79. Other preferred polypeptides are fragments of SEQ.ID.NO: 410 having the 
desired biological activity. Further included in the invention are the polypeptides encoded by the 
human cDNA of clone 179-9-4-0-B8-CS. The polypeptides of SEQ ID NO: 410 may be 
interchanged with the corresponding polypeptides encoded by the human cDNA of clone 
179-9-4-0-B8-CS. Further included in the invention are polynucleotides encoding said polypeptides. 
Preferred polynucleotides are those of SEQ ID NO: 169 and of the human cDNA of clone 
179-9-4-0-B8-CS. 

The invention also encompasses variants of the protein of the invention. A preferred variant 
is one which has at least about 80%, more preferably at least about 90%, and most preferably at 
least about 95% amino acid sequence identity to the amino acid sequence of SEQ.ID.NO: 410, and 
which contains at least one functional or structural characteristic of ankyrin. 
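
The identity thresholds above can be illustrated with a short calculation. The sketch below 
assumes the two sequences have already been aligned (equal length, with "-" marking gaps) and 
assumes identity is counted as identical residues divided by the number of aligned columns; 
other conventions for the denominator exist and give different values. The function names and 
the example sequences are hypothetical and are not taken from the sequence listing. 

```python
# Illustrative only: percent identity over a pre-computed pairwise alignment.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both rows.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)

def preference_tier(identity: float) -> str:
    """Map an identity value to the tiers recited in the paragraph above."""
    if identity >= 95.0:
        return "most preferred"
    if identity >= 90.0:
        return "more preferred"
    if identity >= 80.0:
        return "preferred"
    return "below threshold"

# Hypothetical 10-residue example with a single mismatch in the last column.
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"
identity = percent_identity(reference, variant)
print(identity, preference_tier(identity))
```

In practice the alignment itself would be produced by a dedicated alignment program; this sketch 
covers only the counting step. 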

In a particular embodiment, the invention encompasses a polynucleotide sequence 
comprising the sequence of SEQ ID NO: 410, as well as variants of that sequence. Variants which 
encode at least one functional region characteristic of the ankyrin protein of the present invention 
are encompassed. Codon usage may be varied according to standard techniques in order to enhance 
expression in various hosts. 
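
Varying codon usage relies on the degeneracy of the genetic code: a codon can be replaced by a 
synonymous codon without changing the encoded amino acid. The sketch below shows this for the 
standard genetic code. The `PREFERRED` mapping and the input sequence are hypothetical and chosen 
only for illustration; they are not measured codon preferences of any particular host and are not 
part of the sequence listing. 

```python
# Illustrative only: synonymous recoding under the standard genetic code.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate an in-frame coding sequence ('*' marks a stop codon)."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is given."""
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino = CODON_TABLE[codon]
        new_codon = preferred.get(amino, codon)
        if CODON_TABLE[new_codon] != amino:
            raise ValueError(f"{new_codon} does not encode {amino}")
        out.append(new_codon)
    return "".join(out)

# Hypothetical preference table (assumption, not real host data).
PREFERRED = {"L": "CTG", "R": "CGT", "G": "GGC"}

original = "ATGTTACGAGGGTAA"          # hypothetical: Met-Leu-Arg-Gly-stop
optimized = recode(original, PREFERRED)
print(optimized, translate(optimized))
```

The check inside `recode` guards against a preference table that would silently change the 
encoded protein. 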

In one embodiment, the protein of SEQ.ID.NO: 410 or a fragment or derivative thereof may 
be administered to a subject to treat or prevent a disorder associated with decreased expression or 
activity of ankyrin. Examples of such disorders include, but are not limited to, 
autoimmune/inflammatory disorders such as acquired immunodeficiency syndrome (AIDS), 
Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, 
amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune 
thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), 
bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, 
diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis 
fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, 
Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple 
sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, 
pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's 
syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, 
thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, 
hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and 
helminthic infections, and trauma; cell proliferative disorders such as actinic keratosis, 
arteriosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), 
myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary 
thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, 
myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, 
bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, 
lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, 
thymus, thyroid, and uterus; and vesicle trafficking disorders such as cystic fibrosis, 
glucose-galactose malabsorption syndrome, hypercholesterolemia, diabetes mellitus, diabetes insipidus, 
hyper- and hypoglycemia, Graves' disease, goiter, Cushing's disease, ulcerative colitis, and 
gastric and duodenal ulcers. 

In another embodiment, a vector capable of expressing the protein of SEQ.ID.NO: 410 or a 
fragment or derivative thereof may be administered to a subject to treat or prevent a disorder 
associated with decreased expression or activity of ankyrin including, but not limited to, those 
described above. 

In a further embodiment, a pharmaceutical composition comprising a substantially purified 
protein of SEQ. ID. NO. 410 or a portion of the protein in conjunction with a suitable 
pharmaceutical carrier may be administered to a subject to treat or prevent a disorder associated 
with decreased expression or activity of the same or a similar protein including, but not limited to, 
those provided above. 

In still another embodiment, an agonist of the protein of SEQ. ID. NO. 410 which 
modulates the activity of the protein may be administered to a subject to treat or prevent a disorder 
associated with decreased expression or activity of the protein, or other ankyrin proteins, including, 
but not limited to, those listed above. 

In another embodiment, the polypeptide of SEQ. ID. NO. 410 may be used to produce 
antagonists using methods which are generally known in the art. In particular, purified polypeptide 
may be used to produce antibodies or to screen libraries of pharmaceutical agents to identify those 
which specifically bind ankyrin proteins. Neutralizing antibodies (i.e., those which inhibit dimer 
formation) can also be prepared for therapeutic use. 

In a further embodiment, an antagonist of the protein of SEQ. ID. NO. 410 may be 
administered to a subject to treat or prevent a disorder associated with increased expression or 
activity of the same protein or other members of the ankyrin family of proteins. Such disorders may 
include, but are not limited to, those discussed above. In one aspect, an antibody which specifically 
binds the claimed polypeptide may be used directly as an antagonist or indirectly as a targeting or 
delivery mechanism for bringing a pharmaceutical agent to cells or tissue which express the 
polypeptide. 

In an additional embodiment, a vector expressing the complement of the polynucleotide of 
SEQ. ID. NO. 169 may be administered to a subject to treat or prevent a disorder associated with 
increased expression or activity of ankyrin proteins including, but not limited to, those described 
above. 

In other embodiments, any of the proteins, antagonists, antibodies, agonists, complementary 
sequences, or vectors of the invention may be administered in combination with other appropriate 
therapeutic agents. The combination of therapeutic agents may act synergistically to effect the 
treatment or prevention of the various disorders described above. Using this approach, one may be 
able to achieve therapeutic efficacy with lower dosages of each agent, thus reducing the potential 
for adverse side effects. 

In another embodiment of the invention, the polynucleotides encoding the polypeptide of 
SEQ. ID. NO. 410, or any fragment or complement thereof, may be used for therapeutic purposes. 
In one aspect, the complement of the polynucleotide encoding the above-identified polypeptide may 
be used in situations in which it would be desirable to block the transcription of the mRNA. In 
particular, cells may be transformed with sequences complementary to polynucleotides encoding 
the polypeptide. Thus, complementary molecules or fragments may be used to modulate the activity 
of the claimed polypeptide or related ankyrin proteins, or to achieve regulation of gene function. 
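
A complementary sequence is obtained by Watson-Crick base pairing (A with T, C with G) and is 
conventionally written 5' to 3', which means the paired bases are read in reverse order. The 
following is a minimal sketch of that operation for a DNA string. The function name and the 
six-base input are hypothetical and are not taken from SEQ ID NO: 169. 

```python
# Illustrative only: reverse complement of a DNA sequence.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(dna: str) -> str:
    """Return the complementary strand of dna, written 5' to 3'."""
    unexpected = set(dna) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Pair each base, then reverse so the result reads 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]

sense = "ATGGCC"                      # hypothetical sense-strand fragment
antisense = reverse_complement(sense)
print(antisense)
```

Applying the function twice returns the original sequence, which is a convenient sanity check. 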

In another embodiment of the invention, the nucleotide sequence encoding the polypeptide 
of SEQ. ID. NO. 410 can be used to turn off the genes expressing the polynucleotide or related 
ankyrin proteins. In particular, a cell or tissue can be transformed with expression vectors which 
express high levels of the polynucleotide, or fragment thereof. Such constructs may be used to 
introduce untranslatable sense or antisense sequences into the cell. Expression vectors derived from 
retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be 
used for delivery of the nucleotide sequences to a targeted organ, tissue, or cell population. 

An additional embodiment of the invention relates to the administration of a pharmaceutical 
or sterile composition, in conjunction with a pharmaceutically acceptable carrier, for any of the 
therapeutic effects discussed above. The composition may be delivered via a variety of different 
routes. 

In another embodiment, antibodies which bind the polypeptide of SEQ. ID. NO. 410 may 
be used for the diagnosis of disorders characterized by expression of ANFP, or in assays to monitor 
patients being treated with the polypeptide, other ankyrin proteins or agonists, antagonists, or 
inhibitors of the same. A variety of assay types, including ELISAs, RIAs, and FACS, can be used. 

In another embodiment of the invention, the polynucleotide of SEQ. ID. NO. 169 itself 
may be used for diagnostic purposes. The polynucleotide can be used to generate oligonucleotide 
sequences, complementary RNA and DNA molecules, and PNAs which are useful in diagnosis. The 
polynucleotide and related molecules may be used to detect and quantitate gene expression in 
biopsied tissues in which expression of the polypeptide of SEQ. ID. NO. 410 or other ankyrin 
polypeptides may be correlated with disease. The diagnostic assay may be used to determine 
absence, presence, and excess expression of the polypeptides, and to monitor regulation of 
polypeptide levels during therapeutic intervention. Examples of diagnostic methods include 
Southern or Northern analysis, dot blot, or other membrane-based technologies; PCR 
technologies; dipstick, pin, and multiformat ELISA-like assays; and microarrays utilizing 
fluids or tissues from patients to detect altered ANFP expression. Such qualitative or quantitative 
methods are well known in the art. Such assays may also be used to evaluate the efficacy of a 
particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the 
treatment of an individual patient. 

In further embodiments, oligonucleotides or longer fragments derived from any of the 
polynucleotide sequences described herein may be used as targets in a microarray. The microarray 
can be used to monitor the expression level of large numbers of genes simultaneously and to 
identify genetic variants, mutations, and polymorphisms. This information may be used to 
determine gene function, to understand the genetic basis of a disorder, to diagnose a disorder, and to 
develop and monitor the activities of therapeutic agents. 

In another aspect of the invention, the polypeptide may be used to stimulate the expression 
of genes that have a role in organ and organ system development. Thus, in a preferred embodiment, 
the protein of the invention, a fragment, or derivative thereof, may be administered to a subject to 
treat or prevent developmental disorders. 

In a further embodiment, the protein of the invention may be administered to a subject to 
treat or prevent a cardiovascular disorder. Such disorders can include, but are not limited to, 
arteriosclerosis including atherosclerosis and nonatheromatous arteriosclerosis, hypertension, stroke, 
coronary artery disease, ischemia, myocardial infarction, angina pectoris, cardiac arrhythmias, 
sinoatrial node blocks, atrioventricular node blocks, chronic hemodynamic overload, aneurysm, 
Jervell and Lange-Nielsen syndrome, and long QT syndrome. The protein of the invention may also 
be used as a marker of cardiac hypertrophy, so it may be included in a diagnostic kit for this disease. 

In another embodiment of the invention, the polypeptide and/or polynucleotide may be used 
to inhibit cellular proliferation and to treat and/or diagnose disorders associated with cellular 
proliferation including but not limited to cancer. 

In a further embodiment of the invention, an antagonist of the protein of the invention may 
be administered to a subject to treat or prevent a cancer. In one aspect, an antibody which 
specifically binds the protein of the invention may be used directly as an antagonist or indirectly as 
a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissue which 
express the protein of the invention. 

In yet another embodiment, an antagonist of the protein of the invention may be 
administered to a subject to treat or prevent a neuronal disorder. Such a disorder may include, but is 
not limited to, akathisia, Alzheimer's disease, amnesia, amyotrophic lateral sclerosis, bipolar 
disorder, catatonia, cerebral neoplasms, dementia, depression, diabetic neuropathy, Down's 
syndrome, tardive dyskinesia, dystonias, epilepsy, Huntington's disease, peripheral neuropathy, 
multiple sclerosis, neurofibromatosis, Parkinson's disease, paranoid psychoses, postherpetic 
neuralgia, schizophrenia, and Tourette's disorder. In one aspect, an antibody which specifically 
binds the protein of the invention may be used directly as an antagonist or indirectly as a targeting 
or delivery mechanism for bringing a pharmaceutical agent to cells or tissue which express the 
protein of the invention. 

In another embodiment, the protein of the invention can be administered to a subject to treat 
or prevent malaria. The protein of the invention may also be used in the diagnosis of malaria. 

In yet another embodiment, the polynucleotide and/or the polypeptide of the present 
invention can be used as a therapeutic composition capable of protecting an animal from a disease 
caused by a parasitic helminth. The polypeptide can be used as a target for antiparasitic vaccines and 
drugs. 

Ankyrin has been shown to underlie membrane proteins including CD44, the voltage- 
dependent sodium channel, Na+/K+ ATPase and the anion exchanger protein. It is believed that the 
formation of a direct connection between ankyrin and functionally important transmembrane 
proteins/membrane skeleton may be one of the earliest events to occur during signal transduction 
and cell activation. Thus, in a further embodiment, the polypeptide of the present invention can be 
used to disrupt the connection between ankyrin and the membrane, thus affecting fundamental 
processes within the cell. 

The polypeptide of the present invention can be further used to screen for compounds that 
inhibit or enhance the binding of ankyrin binding proteins and to affect the association between 
ankyrin and proteins, such as Alpha-Na,K-ATPase, which are critical to intracellular transport of 
ions and nutrients. 

In yet another embodiment the regulatory domain of the polypeptide of SEQ. ID. NO. 410 or 
antagonists thereof can be used to enhance or disrupt the protein binding activities of the other 
domains. 

Proteins of SEQ ID NOs: 385 and 416 (internal designations 105-021-3-0-C3-CS and 188-31-1-0-E6-CS 
respectively) 

The 354-amino-acid protein of SEQ ID NO: 385, encoded by the cDNA of SEQ ID NO: 
144 found in brain, displays 6 kelch motifs (pfam accession number PF01344) at positions 20-66, 
68-114, 116-162, 164-209, 211-265 and 270-316. Moreover, 4 residues conserved in over 90% of 
kelch family sequences are found in the protein of the invention: di-glycine at positions 133-134, 
tyrosine 148 and tryptophan 155. In addition, six residues separate tyrosine 148 and 
tryptophan 155; this feature is conserved in over 70% of kelch proteins (Adams et al., Trends in Cell 
Biology, 10:17-24, 2000). The protein of the invention encoded by the cDNA of SEQ ID NO: 144 
is a polymorphic variant of the protein of SEQ ID NO: 416 encoded by the cDNA of SEQ ID NO: 
175, and is thought to have the same functions and utilities. 
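
The repeat coordinates and conserved positions stated above can be cross-checked with a short script. The following is a minimal Python sketch and not part of the original disclosure: the constant and function names are hypothetical, and only the positions given in the text are used (no sequence data are assumed).

```python
# Kelch repeat boundaries (1-based, inclusive) as stated for SEQ ID NO: 385.
KELCH_REPEATS = [(20, 66), (68, 114), (116, 162), (164, 209), (211, 265), (270, 316)]

# Conserved positions named in the text.
DI_GLYCINE = (133, 134)
TYROSINE = 148
TRYPTOPHAN = 155


def repeat_lengths(repeats):
    """Return the length in residues of each inclusive (start, end) range."""
    return [end - start + 1 for start, end in repeats]


def residues_between(pos_a, pos_b):
    """Count the residues lying strictly between two positions."""
    return abs(pos_b - pos_a) - 1


def containing_repeat(position, repeats):
    """Return the 1-based index of the repeat containing a position, or None."""
    for index, (start, end) in enumerate(repeats, start=1):
        if start <= position <= end:
            return index
    return None


if __name__ == "__main__":
    print(repeat_lengths(KELCH_REPEATS))
    print(residues_between(TYROSINE, TRYPTOPHAN))
    for position in (*DI_GLYCINE, TYROSINE, TRYPTOPHAN):
        print(position, containing_repeat(position, KELCH_REPEATS))
```

Under these coordinates, all four conserved positions fall within the third repeat (116-162), and six residues lie between tyrosine 148 and tryptophan 155, in agreement with the description.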

Drosophila kelch is located in the ring canals, which are actin-rich bridges. Kelch localizes 
to the rim of preformed canals and serves to maintain actin organization (Xue et al., Cell (72)681- 
693, 1993; Robinson et al., J. Cell Biol. (138)799-810, 1997). In mammalian sperm, calicin is 
located within an actin-negative structure termed the calyx, which is involved in the morphogenesis 
of the spermatocyte (von Bulow et al., Exp. Cell Res. (219)407-413, 1995). Calicin and a well- 
structured calyx are both lacking in certain teratozoospermias, possibly indicating a central role for 
calicin in the organization of this structure (Courtot et al., Mol. Reprod. Dev. (28)272-279, 1991). In 
Schizosaccharomyces pombe, Ral2p acts downstream of Ras1p in pathways that affect cell 
morphology, conjugation and sporulation. The spherical morphology and mating defects of ral2-null 
cells are complemented by overexpression of Ras1p, indicating a close functional interaction 
between the two proteins (Fukui et al., Mol. Cell. Biol. (9)5617-5622, 1989). The transcription 
factor Nrf2 is sequestered by the kelch-repeat containing Keap1 protein under normal cellular 
conditions. Stimulation by agents such as diethylmaleate induces the translocation of Nrf2 to 
the nucleus to initiate the cytoprotective electrophilic counterattack response (Itoh et al., Genes 
Dev. (13)76-86, 1999). Lytic infection of cells by herpes simplex virus is initiated by binding of 
virally encoded VP16 to HCF-1, a protein thought to have a normal role in cell-cycle progression. 
The HCF-VP16 complex then assembles with Oct-1 transcription factor on cis-regulatory targets in 
the HSV genome to initiate virus replication (Wilson et al., Mol. Cell. Biol. (17)6139-6146, 1997; 
Hughes et al., J. Biol. Chem. (274)16437-16443, 1999). Two recently discovered mammalian kelch- 
repeat proteins have extracellular roles. Human attractin appears to participate in normal immune 
defence as a serum glycoprotein released by activated T cells. In coculture assays, attractin 
stimulates adhesion and spreading of monocytes, facilitating the development of T-cell clusters and 
cellular immune responses (Duke-Cohan et al., Proc. Natl. Acad. Sci. U. S. A. (95)11336-11341, 
1998). Attractin is orthologous to the extracellular domain of mouse mahogany, a large, 
multidomain, transmembrane protein that has been implicated in the homeostasis of energy 
metabolism by its suppressive effects on certain types of obesity in mice (Gunn et al., Nature 
(398)152-156, 1999; Nagle et al., Nature (398)148-152, 1999). 

Evidence for the importance of kelch repeat beta-propellers in protein function has also 
come from studies of natural and engineered loss-of-function mutations. Caenorhabditis elegans 
Spe-26 mutant spermatocytes fail to complete meiosis, contain multiple nuclei and show gross 
disorganization of actin filaments and organelles. For five out of the six alleles that have been 
examined in detail, the mutations map within the kelch repeats (Varkey et al., Genes Dev. (9)1074- 
1086, 1995). Of particular interest are the point mutations in RAG-2 that have been identified in 
some cases of human B-cell-negative severe combined immuno-deficiency or of Omenn syndrome 
(Schwarz et al., Science (274)97-99, 1996; Villa et al., Cell (93)885-896, 1998). Very recently, 
gigaxonin, a new member of the cytoskeletal BTB/kelch repeat family, was described as mutated in 
giant axonal neuropathy (Bomont et al., Nat. Genet. (26)370-374, 2000). 

It is believed that the proteins of the invention are members of the kelch superfamily and, 
as such, play a role in association with the actin cytoskeleton, the organization of cytoskeletal, 
plasma membrane or organelle structures, the coordination of morphology and growth, gene 
expression, viral pathogenesis, and immune defence. In particular, the proteins of the invention are 
highly expressed in brain and are believed to be related to CNS disorders. Preferred polypeptides 
of the invention are polypeptides comprising the amino acids of the proteins of the invention at positions 
20-66, 68-114, 116-162, 164-209, 211-265 and 270-316. In one embodiment, the proteins of the 
invention or part thereof are used to modulate actin organization in cells, thus affecting the 
cytoskeleton and cell function in general. 

The invention also features compounds, e.g., proteins, which interact with the protein of the 
invention. Any method suitable for detecting protein-protein interactions may be employed for 
identifying transmembrane proteins, intracellular, or extracellular proteins that interact with the 
protein. Among the traditional methods which may be employed are co-immunoprecipitation, 
crosslinking and co-purification through gradients or chromatographic columns of cell lysates, or 
proteins obtained from cell lysates, and the use of the proteins of the invention to identify proteins 
in the lysate that interact with it. For these assays, the protein of the invention can be full length or 
some other suitable protein polypeptide fragment. Once isolated, such an interacting protein can be 
identified and cloned and then used, in conjunction with standard techniques, to identify proteins 
with which it interacts. For example, at least a portion of the amino acid sequence of a protein 
which interacts with the protein of the invention can be ascertained using techniques well known to 
those of skill in the art, such as via the Edman degradation technique. The amino acid sequence 
obtained may be used as a guide for the generation of oligonucleotide mixtures that can be used to 
screen for gene sequences encoding the interacting protein. Screening may be accomplished, for 
example, by standard hybridization or PCR techniques. Techniques for the generation of 
oligonucleotide mixtures and the screening are well known ("PCR Protocols: A Guide to Methods 
and Applications," Innis et al., eds. Academic Press, Inc., NY, 1990). 

Additionally, methods can be employed which result directly in the identification of genes 
which encode proteins that interact with the protein of the invention. These methods include, for 
example, screening expression libraries, in a manner similar to the well known technique of 
antibody probing of lambda gt11 libraries, using labeled polypeptide or a protein fusion protein, 
e.g., a protein polypeptide or domain fused to a marker such as an enzyme, fluorescent dye, a 
luminescent protein, or to an IgFc domain. 

Another embodiment of the invention relates to compositions and methods using the protein 
of the invention or part thereof to modulate actin organization and related cytoskeletal protein 
organization in cells and, in particular, cells of the CNS. Compositions containing the protein of the 
invention and fragments thereof may be therapeutic and used to treat a variety of neuronal 
disorders. An additional embodiment of the invention relates to the administration of a 
pharmaceutical or sterile composition, in conjunction with a pharmaceutically acceptable carrier, for 
any of the therapeutic effects discussed. The composition may be delivered via a variety of different 
routes. 

In yet another embodiment, an antagonist of the protein of the invention may be 
administered to a subject to treat or prevent a neuronal disorder. Such a disorder may include, but is 
not limited to, akathisia, Alzheimer's disease, amnesia, amyotrophic lateral sclerosis, bipolar 
disorder, catatonia, cerebral neoplasms, dementia, depression, diabetic neuropathy, Down's 
syndrome, tardive dyskinesia, dystonias, epilepsy, Huntington's disease, peripheral neuropathy, 
multiple sclerosis, neurofibromatosis, Parkinson's disease, paranoid psychoses, postherpetic 
neuralgia, schizophrenia, and Tourette's disorder. In one aspect, an antibody which specifically 
binds the protein of the invention may be used directly as an antagonist or indirectly as a targeting 
or delivery mechanism for bringing a pharmaceutical agent to cells or tissue which express the 
protein of the invention. 

Another embodiment of the invention encompasses DNA sequences which encode the 
proteins of the invention that may be derived through biological or synthetic chemistry processes. 
Polynucleotide sequences capable of hybridizing to the cDNA sequences of SEQ ID NOs: 144 and 
175 are also included in the scope of the invention. 

Further included in the invention are the polypeptides encoded by the human cDNA of 
clones 105-021-3-0-C3-CS and 188-31-1-0-E6-CS. The polypeptides of SEQ ID NOs: 385 and 416 
may be interchanged with the corresponding polypeptides encoded by the human cDNA of clones 
105-021-3-0-C3-CS and 188-31-1-0-E6-CS. Further included in the invention are polynucleotides 
encoding said polypeptides. Preferred polynucleotides are those of SEQ ID NOs: 144 and 175 and 
of the human cDNA of clones 105-021-3-0-C3-CS and 188-31-1-0-E6-CS. 

Another embodiment of the invention relates to methods of using the nucleotide sequence of the 
invention, or part thereof, to search for homologs of the protein of the invention. The sequence can be 
used as a template in PCR reactions, allowing the detection/quantification of the protein of the 
invention, or part thereof, or its homologs. The complementary sequence, or part thereof, may be used as 
a hybridization probe to detect/quantify the transcription level, both in vitro and at the cellular 
level. In particular, such probes may be used in a diagnostic context, for example in cellular or 
tissue in situ hybridization. 

Another embodiment of the invention relates to methods of using the nucleotide sequence of the 
invention, or part thereof, to design antisense oligonucleotides to modulate the in vitro or in vivo 
expression of the protein or part thereof. This may be useful in the therapeutic 
area of the diseases listed above, particularly in the context where the protein of the invention is 
expressed at an abnormally high level. 

In a further embodiment of the invention, the protein of the invention or portions thereof are 
used to produce specific antibodies. These antibodies may have applications in diagnostics and in the 
purification of the protein of the invention, or part thereof, or a homolog. They may also help to 
visualize the locations of the proteins associated with the protein of the invention in the cell, in particular 
for highlighting structure-related proteins. The methods described herein in which protein 
antibodies are employed may be performed, for example, by utilizing pre-packaged diagnostic kits 
comprising at least one specific cDNA of SEQ ID NO. or antibody reagent described herein, which 
may be conveniently used, for example, in clinical settings, to diagnose patients exhibiting 
symptoms of various CNS disorders. 

In still another preferred embodiment, the present invention relates to methods of using the 
protein of the invention or part thereof in gene therapy, particularly in diseases involving the 
CNS, and particularly in the context where the protein of the invention is expressed at an abnormally low 
level. Gene therapy is a potential therapeutic approach in which normal copies of the cDNAs of 
SEQ ID NOs: 144 and 175 may be introduced into subjects to successfully code for normal protein 
in several different affected cell types. 

Another aspect of the present invention includes a formulation comprising the protein of the 
invention or part thereof and a pharmaceutically or physiologically acceptable carrier. A 
formulation of the present invention comprises a combination of one or more peptides as described 
herein, or mimetopes thereof; a combination of antibodies as described herein, or mimetopes 
thereof; or a combination of antibodies and peptides as described herein, or mimetopes thereof. 
Such a formulation may be administered to a subject in need thereof to treat or prevent a disorder 
associated with decreased expression or activity of the protein. Examples of such disorders include 
but are not limited to those of the CNS and other tissues where the association of actin appears to be 
abnormal. 



Proteins of SEQ ID NOs: 391, 393, 405 and 407 (internal designations 145-52-2-0-D12-CS, 
145-7-3-0-D3-CS, 174-17-1-0-D6-CS and 174-38-4-0-D11-CS respectively) 

The cluster of four proteins (SEQ ID NOs: 391, 393, 405 and 407) encoded by the cDNAs of 
SEQ ID NOs: 150, 152, 164 and 166 respectively exhibit very strong homology to claudin-8, a member 
of PMP22-Claudin family (PF00822). SEQ ID NO: 405 (174-17-1-0-D6-CS) shares a high degree of 
identity with human claudin-8. SEQ ID NO: 393 contains two amino acid substitutions as compared to 
human claudin-8 (T129A and S151P); thus, it appears to be a polymorphic variant of claudin-8. SEQ 
ID NOs: 393 and 405 contain four membrane spanning segments. 

SEQ ID NOs: 391 and 407 are polymorphic forms of claudin-8. The protein of SEQ ID NO: 
391 (145-52-2-0-D12-CS) is 162 amino-acids long, contains three theoretical membrane spanning 
segments, and shows three amino acid substitutions as compared to the previously identified claudin-8 
protein (R31I, S151P and E162). The protein of SEQ ID NO: 407 (174-38-4-0-D11-CS) is 43 amino- 
acids long. SEQ ID NO: 407 contains a stop codon at position 44 and contains no apparent membrane 
spanning segments. 
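
Substitution codes of the form used above (for example T129A or S151P) give the reference residue, its 1-based position, and the variant residue. The following minimal Python sketch shows how such codes can be parsed and applied to a reference sequence. It is illustrative only: the function names are hypothetical, and the five-residue toy sequence is an assumption standing in for a real reference, which is not reproduced here.

```python
import re

# One-letter reference residue, 1-based position, one-letter variant residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split a code such as 'T129A' into (reference, position, variant)."""
    match = SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"not a complete substitution code: {code!r}")
    ref, pos, var = match.groups()
    return ref, int(pos), var


def apply_substitutions(reference, codes):
    """Return the reference sequence with each substitution applied."""
    residues = list(reference)
    for code in codes:
        ref, pos, var = parse_substitution(code)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"{code}: position outside the reference sequence")
        if residues[pos - 1] != ref:
            raise ValueError(f"{code}: reference has {residues[pos - 1]} at {pos}")
        residues[pos - 1] = var
    return "".join(residues)


def list_differences(reference, variant):
    """List substitution codes between two sequences of equal length."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [f"{a}{i}{b}" for i, (a, b) in enumerate(zip(reference, variant), start=1) if a != b]


if __name__ == "__main__":
    toy_reference = "MATRS"  # hypothetical toy sequence, not claudin-8
    toy_variant = apply_substitutions(toy_reference, ["T3A", "S5P"])
    print(toy_variant)
    print(list_differences(toy_reference, toy_variant))
    print(parse_substitution("S151P"))
```

A code that lacks a variant residue is rejected by `parse_substitution` rather than completed by guesswork.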

The Claudin family of proteins comprises more than twenty (20) small glycoproteins with four 
predicted transmembrane domains. The tissue distribution pattern of claudins varies significantly, 
depending on claudin species. Many have been identified as components of tight junction (TJ) strands 
which contribute to the regulation of cell polarity and permeability. Polarized epithelial and endothelial 
cells form barriers that separate biological compartments and regulate homeostasis. The tight junction 
(TJ) is a specialized membrane domain at the most apical region of polarized epithelial and endothelial 
cells that not only creates a primary barrier to prevent paracellular transport of solutes (barrier function) 
but also restricts the lateral diffusion of membrane lipids and proteins to maintain the cellular polarity 
(fence function). Tight junctions appear to represent a continuous network of interconnected rows of 
intramembranous particles that appear as strands with complementary grooves. The TJ-specific integral 
membrane proteins, i.e. the components of TJ strands, occludins and claudins, were only recently 
identified. 

Claudin-1 and -2 have the ability to induce the formation of networks of strands/grooves at 
cell-cell contact sites when introduced into fibroblasts lacking TJs. Occludin induces only a small 
number of short strands at cell-cell contact sites in fibroblasts; thus, it is an accessory protein in terms of 
TJ strand formation. Claudin transfection experiments in fibroblasts revealed the TJ strand itself can be 
formed without occludin (Saitou M et al. J. Cell Biol. 141:397-408 (1998), Furuse M et al. J. Cell Biol. 
143:391-401 (1998)). 

Initially several members of the claudin family were reported (RVP1, Clostridium perfringens 
enterotoxin receptor (CPE-R), and TMVCF (transmembrane protein deleted in Velo-cardio-facial 
syndrome)), but their physiological functions were not determined. After the identification of claudin-1 
and -2 as novel components of TJ strands (Furuse, M. et al. J. Cell Biol. 141, 1539-1550 (1998)), CPE- 
R was shown to remove specific claudins from TJs. In its presence, TJ strands in C3L cells gradually 
disintegrate and the number of TJ strands and the complexity of their network decreases markedly 
(Sonoda N et al. J Cell Biol 147(1):195-204 (1999)). In distal tubules of the kidney, claudin-4 (CPE-R) 
and claudin-8 were co-localized with occludin at their junctional complex region. In liver, claudin-3 
and occludin were co-localized along bile canaliculi and TJ strands were labeled heavily and 
specifically with anti-claudin-3 Ab (Morita et al. PNAS 96(2):511-516 (1999)). The claudins have 
been shown to create the paracellular diffusion barrier and, surprisingly, they may also confer channel- 
like selectivity for passage of solutes through the tissue barrier (Anderson JM and Van 
Itallie CM, Current Biology 9:R922-R924 (1999)). 

The existence of the claudin multigene family as well as the tissue distribution pattern of each 
claudin species suggests that similar complexity can be expected in TJs and contributes to the 
generation of functional diversity of TJs in vivo. More than two distinct claudins are co-expressed in 
a single epithelial cell. Claudins interact between each of the paired strands in a heterophilic manner and 
distinct claudins are (except in some combinations) co-incorporated into individual TJ strands (Furuse 
M et al. J Cell Biol 147(4):891-903 (1999)). 

Several claudins have been shown to be expression markers of malignant cells. For example, 
SEMP1 (senescence-associated epithelial membrane protein) is expressed in normal human tissues, 
including adult and fetal liver, pancreas, placenta, adrenals, prostate and ovary; however, SEMP1 is 
expressed at low or undetectable levels in a number of human breast cancer cell lines (Swisshelm K et 
al. Gene 226:285-295 (1999)). Another member of the claudin family was found to be exclusively 
expressed in MCF-7ADR mammary carcinoma cells. MCF-7ADR carcinoma cells are estradiol- 
independent for growth, estrogen-receptor negative, tamoxifen resistant, vimentin positive and invasive 
in vitro and in vivo (Schiemann S et al., Anticancer Res 17(1A):13-20 (1997)). Further, down- 
regulation of the expression of claudin-1 has been associated with oncogenesis in rat salivary gland 
epithelium cells (Li D and Mrsny RJ J Cell Biol 148(4):791-800 (2000)). 

The increase in microvascular permeability in human gliomas, contributing to clinically severe 
symptoms of brain edema, appears to be the result of a dysregulation of junctional proteins. Increased 
TJ permeability of the colon epithelium, and consequently a decrease in epithelial barrier function, 
precedes the development of colon tumors, including carcinomas and adenomatous polyps (Soler AP et 
al. Carcinogenesis 20(8):1425-1431 (1999)). Studies of the interendothelial junctions in tumor 
microvessels of human glioblastoma multiforme show that the expression of claudin-1 is lost in the 
majority of tumor microvessels, whereas claudin-5 is significantly down-regulated only in hyperplastic 
vessels. A relationship between claudin-1 suppression and the alteration of tight junction morphology is 
likely to correlate with the increase of endothelial permeability (Liebner S et al. Acta Neuropathol 
(Berl) 100(3):323-331 (2000)). 

The human phenotype of mutations in claudin-16 suggests that it creates a channel allowing 
magnesium to diffuse through renal tight junctions. Similarly, a mouse knockout of claudin-11 reveals 
its role in formation of tight junctions in myelin and between Sertoli cells in testis (Mitic LL et al. Am J 
Physiol Gastrointest Liver Physiol 279(2):G250-254 (2000)). 

Opening of TJs by environmental proteinases may be the initial step in the development of 
asthma to a variety of allergens. The lung epithelium forms a barrier that allergens must cross before 
they can cause sensitization. The cysteine proteinase allergen Der p 1 from fecal pellets of 
Dermatophagoides pteronyssinus (the house dust mite (HDM)) causes disruption of intercellular tight 
junctions (TJs), which are the principal components of the epithelial paracellular permeability barrier. 
TJ breakdown nonspecifically increases epithelial permeability, allowing Der p 1 to cross the epithelial 
barrier. Putative Der p 1 cleavage sites were found in peptides from an extracellular domain of 
claudin-1. House dust mite (HDM) allergens are important factors in the increasing prevalence of asthma (Wan 
H et al. J Clin Invest 104(1):123-33 (1999)). 

In many intestinal and systemic diseases, intestinal barrier damage is marked by changes in 
intestinal permeability which are, in turn, related to alteration in tight junction function (Gasbarrini G, 
Montalto M, Ital J Gastroenterol Hepatol 31(6):481-488 (1999)). Permeability of the tight junctions can 
be modified by bacterial toxins, cytokines, hormones and drugs. Oligodendrocyte-specific protein 
(OSP/claudin-11), found in CNS myelin, appears to be a promising candidate for auto-antigenic 
involvement in autoimmune demyelinating disease. The presence of anti-OSP Abs in the cerebrospinal 
fluid was reported for relapsing-remitting multiple sclerosis (MS). Murine OSP peptides elicit clinical 
experimental autoimmune encephalomyelitis in animal models for MS and induce mononuclear cell 
infiltrates and focal demyelination. OSP peptides also elicit robust proliferative responses in T cells 
(Stevens DB et al. J Immunol 162:7501-7509 (1999)). OSP/claudin-11 appears to modulate 
proliferation and migration of oligodendrocytes, presumably through the membrane interactions at tight 
junctions and with the extracellular matrix (Bronstein JM et al. J Neurosci Res 59(6):706-711 (2000)). 
Recently claudin-11 has been shown to play a key role in the formation of the hematotesticular barrier; it is 
regulated by FS hormone and by cytokines in early fetal and postnatal development in Sertoli cells 
(Hellani A. et al. Endocrinology 141:3012-3019 (2000)). 

SEQ ID NOs: 391, 393, 405 and 407 are new human proteins having biological activities 
described for claudins. Nucleic acids encoding the proteins of interest are overrepresented in fetal 
kidney and in salivary gland. The subject invention provides polynucleotides encoding the proteins of 
SEQ ID NOs: 391, 393, 405 and 407. In one embodiment, the polypeptides of SEQ ID NOs: 391, 393, 
405 and 407 are interchanged with the polypeptides encoded by clones 145-52-2-0-D12-CS, 
145-7-3-0-D3-CS, 174-17-1-0-D6-CS and 174-38-4-0-D11-CS. Also provided are uses of these proteins, 
fragments, derivatives thereof (and related polynucleotides) for the diagnosis, treatment, or prevention 
of tumors and other diseases, including disorders associated with altered epithelial function. The 
invention also encompasses possible variants of the proteins of interest which have at least about 80%, 
more preferably at least about 90%, and most preferably at least about 95% amino acid sequence 
identity to the amino acid sequence, provided the variants have at least one of the functional or structural 
characteristics of the identified claudin-like proteins. 
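
The identity thresholds recited above (about 80%, 90% and 95%) can be illustrated with a simple calculation on two pre-aligned sequences. The following is a minimal Python sketch under stated assumptions: identity is taken as identical residues divided by aligned columns (columns where both sequences carry a gap are ignored), which is only one of several conventions in use; the function names and toy sequences are hypothetical and are not sequences of the invention.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: columns in which both sequences carry a gap are ignored;
    a residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def identity_tier(identity, thresholds=(95, 90, 80)):
    """Return the highest threshold met by an identity value, or None."""
    for threshold in sorted(thresholds, reverse=True):
        if identity >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    # Toy alignments, hypothetical and for illustration only.
    for seq_a, seq_b in [("ACDEFGHIKL", "ACDEFGHIKV"), ("AC-E", "ACDE")]:
        identity = percent_identity(seq_a, seq_b)
        print(seq_a, seq_b, identity, identity_tier(identity))
```

In practice the alignment itself would come from a dedicated alignment tool, and the identity convention used would need to be stated alongside any reported percentage.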

In one embodiment of the subject invention, the proteins of interest, or biologically active 
fragments or variants thereof, may be administered to a subject to treat or prevent disorders of salivary 
gland, kidney and prostate. The subject invention also provides therapeutic regimens for the treatment 
of epithelial dysfunction and cancer. 

The disorders which may be treated in accordance with the subject invention include, but are 
not limited to, asthma, eczema, atopic dermatitis, contact dermatitis, stasis dermatitis, seborrheic 
dermatitis, psoriasis, lichen planus, pityriasis rosea, acne vulgaris, acne rosacea, pemphigus vulgaris, 
pemphigus foliaceus, paraneoplastic pemphigus, bullous pemphigoid, herpes gestationis, dermatitis 
herpetiformis, linear IgA disease, epidermolysis bullosa acquisita, dermatomyositis, lupus 
erythematosus, scleroderma, and morphea; gastritis, peptic ulcers, cholelithiasis, cholecystitis, 
pancreatitis, cirrhosis, ulcerative colitis, Crohn's disease, and irritable bowel syndrome; Addison's 
disease, Lowe's syndrome, glomerulonephritis, chronic glomerulonephritis, tubulointerstitial nephritis, 
inherited X-linked nephrogenic diabetes insipidus, autosomal dominant polycystic kidney disease, 
autoimmune demyelinating disease, multiple sclerosis, glioma, and other tumors. 

A further aspect of the invention provides a method for treating these and/or other pathological 
states by administering, to a patient, a therapeutically effective amount of one or more of the proteins of 
interest. The proteins of interest may, optionally, be simultaneously or sequentially administered in 
conjunction with cytokines and/or interleukins which have been shown to improve claudin expression. 

In another embodiment, a vector capable of driving expression of one or more of the proteins
of interest, or a biologically active fragment or variant thereof, may be administered to a subject to treat
or prevent an epithelial permeability disorder including, but not limited to, the aforementioned disorders.

Another embodiment of the subject invention provides compositions and methods of treating,
or reducing the incidence of, asthma comprising the administration of therapeutically effective amounts
of the proteins of the subject invention. In one embodiment, purified fragments, or synthetically
modified peptides, derived from the extracellular domains of the proteins of interest may be
administered in the therapeutic regimen. The peptides, containing the putative cleavage sites for
environmental allergen proteinases, may be administered in amounts which competitively inhibit the
proteinase activity of the allergen. The peptides may be designed to bind allergen, optionally in an
irreversible manner, and inhibit proteinase activity.

The negative effects of the usual preservation solutions on epithelial and endothelial
permeability in organs to be transplanted are generally known (Trocha S.D. et al. Ann. Surg. 230:105-113
(1999)). Increases in permeability lead to tissue injury and edema. Disorganization of tight
junctional proteins appears to be responsible for the observed tissue injury and edema. Thus, in another
embodiment, purified proteins of interest, or variants and/or biologically active fragments thereof, may
be added to organ preservation solutions to maintain the content and integrity of tight junctions in
organs.

In another embodiment, the subject invention provides methods of producing "bioartificial"
epithelia from non-epithelial cells. The "bioartificial" epithelia produced according to the invention
may be used for reconstructive surgical procedures, for treating disorders related to epithelial loss
(for hereditary, traumatic or oncological reasons) or for other therapeutic purposes (e.g., burn
treatments). "Bioartificial" epithelial cells can be obtained by transfection and remodeling of
autologous patient cells not affected by any of the aforementioned disorders. The use of autologous
cells in the preparation of the "bioartificial" epithelial cells of the invention in methods of treating
disorders, conditions, or diseases associated with the loss of epithelial cells reduces or eliminates the
risk of tissue rejection typically observed in transplantation methodologies. Methods of bioartificial
tissue engineering are generally known to those skilled in the art (for a review, see Machens H.G. et al.
Cells Tissues Organs 167: 88-94 (2000)).

Another embodiment of the subject invention provides antibodies which specifically bind to
the proteins of SEQ ID Nos: 391, 393, 405 and 407. The antibodies may also specifically bind to
fragments or variants of the proteins described in SEQ ID Nos: 391, 393, 405 and 407. The antibodies
of the invention may be used to detect the protein of interest in human body fluids, extracts of cells or
tissue extracts. The detection assays may be used for epithelial cancer prognosis and for the diagnosis
of disorders. The assays may also be used to monitor patients being treated with the proteins of
interest.

In another embodiment, the polynucleotide sequences, or fragments of said polynucleotide
sequences, encoding the proteins of interest may be used for the identification or diagnosis of a disorder
associated with expression of the proteins of SEQ ID Nos: 391, 393, 405 and 407. Hybridization
assays which allow for the detection of polynucleotide sequences of the invention are well known to the
skilled artisan. These assays include, but are not limited to, Northern blots, Southern blots, and PCR
methodologies.

Another embodiment of the invention provides the proteins of SEQ ID Nos: 391, 393, 405 and
407, variants, immunogenic fragments, or biologically active fragments of said proteins for screening
libraries of compounds in any of a variety of drug screening techniques. The proteins of SEQ ID Nos:
391, 393, 405 and 407, variants, immunogenic fragments, or biologically active fragments of said
proteins employed in such screening may be free in solution, affixed to a solid support, recombinantly
expressed on, or chemically attached to, a cell surface, or located intracellularly. The formation of
binding complexes between the protein of interest and the agent being tested may be measured by
methods well known to those skilled in the art.

Yet another embodiment of the invention provides methods of screening compounds which
modulate epithelial permeability. Polynucleotides encoding the proteins of SEQ ID Nos: 391, 393, 405
and 407, variants, immunogenic fragments, or biologically active fragments of said proteins, may be
recombinantly expressed in cells typically lacking TJs according to methods discussed supra. These
cells may then be used to assess therapeutic modulators (based, for example, on CPE-like compounds)
for the ability to increase or decrease epithelial cell permeability. Compounds identified in these
modulator screen assays may then be used in therapeutic protocols to adjust epithelial cell permeability
as desired by the practitioner.

The intestinal epithelium is a major barrier to the absorption of hydrophilic drugs. The 
presence of intercellular junctional complexes, particularly the tight junctions, renders the epithelium 
impervious to hydrophilic drugs, which cannot diffuse across the cells through the lipid bilayer of the 
cell membranes (Ward PD et al. Pharmaceutical Science and Technology Today 3:10:346-358 (2000)). 

Therefore, in another embodiment of the subject invention the proteins of SEQ ID Nos: 391, 393, 405 and 407,
variants, or biologically active fragments of said proteins and their molecular partners can be used for
the rational design of compounds that can effectively and safely increase paracellular permeability for
selected drugs. For example, polynucleotides encoding the proteins of interest, or any fragment or
derivative thereof, may be used for these purposes. In one aspect, the complement of the
polynucleotide encoding the protein of interest may be used in situations in which it would be desirable
to block the transcription of the mRNA encoding the proteins of interest, especially for temporarily
increasing epithelial permeability (useful for drug delivery). Alternatively, sense or antisense
oligonucleotides may be designed from various locations along the coding or control regions of
polynucleotide sequences encoding the proteins of SEQ ID Nos: 391, 393, 405 and 407, as well as
variants, or biologically active fragments of said proteins to control expression of the proteins.
Methods of producing and using sense and antisense oligonucleotides are well known to those skilled
in the art.
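
To make the idea of designing antisense oligonucleotides "from various locations along" a coding sequence more concrete, the following minimal sketch tiles a sense-strand DNA sequence with fixed-length windows and returns the reverse complement of each window. The function names, the default oligonucleotide length and step, and the toy sequence in the usage note are assumptions chosen for illustration; the specification does not fix any of these, and real designs would also weigh target accessibility, specificity and chemistry.

```python
# Translation table for the four unambiguous DNA bases (case preserved).
# Assumption: any other character (e.g. N) is passed through unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_candidates(sense: str, length: int = 20, step: int = 5):
    """Yield (start, oligo) pairs tiling the sense strand.

    start is the 1-based position of the window on the sense strand;
    oligo is the antisense (reverse-complement) sequence of that window.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    for start in range(0, len(sense) - length + 1, step):
        window = sense[start:start + length]
        yield start + 1, reverse_complement(window)
```

With the hypothetical input `"ATGGCCAAGT"`, a window length of 6 and a step of 4, the generator yields windows starting at positions 1 and 5 of the sense strand.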

Claudins are unique proteins with specific protein-binding properties. Therefore, in another
preferred embodiment, the proteins of SEQ ID Nos: 391, 393, 405 and 407, variants, or biologically
active fragments of said proteins may be used as a component of drug delivery vehicles such as colloids
or liposomes. The proteins of SEQ ID Nos: 391, 393, 405 and 407, variants, or
biologically active fragments of said proteins may be incorporated into the lipid membranes of
liposomes and can serve as specific targeting agents which bind the specific epithelial targets and
facilitate targeted epithelium drug delivery. Methods for designing drug delivery systems of this type
are known to those skilled in the art (Smith H.J. Introduction to the principles of drug design
and action, 3rd ed. (1998)). Alternatively, active agents, such as chemotherapeutic agents,
radioisotopes, or prodrugs, may be directly attached, recombinantly or chemically, to the proteins of SEQ
ID Nos: 391, 393, 405 and 407, variants, or biologically active fragments of said proteins and used in
therapeutic regimens.

Proteins of SEQ ID Nos: 278, 282 and 300 (internal designations 160-37-2-0-H7-CS, 174-33-3-0-F6-CS,
184-4-1-0-A11-CS respectively)

The protein of SEQ ID No: 278 (and the corresponding allelic variants 282 and 300) encoded
by the cDNA SEQ ID No: 37 (41, 59 respectively) shows homology to a human transmembrane
protein (HTMN-23, Genseq accession number Y57899). The protein of SEQ ID No: 278 (and the
corresponding polymorphic variants 282 and 300), overexpressed in salivary gland, contains 9 potential
transmembrane segments from positions 85 to 105, 116 to 136, 164 to 184, 187 to 207, 332 to 352, 376
to 396, 404 to 424, 465 to 485 and 499 to 519, thus displaying characteristic features of type III
transmembrane proteins (Singer, Annu. Rev. Cell Biol., 6:247-296, 1990). Furthermore, a predicted
localisation in the endoplasmic reticulum (ER) is found for the protein of the invention with the
software PSORT.
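
The segment coordinates listed above can be handled programmatically. The sketch below, offered only as an illustration, stores the nine ranges as 1-based inclusive positions, checks their lengths, derives the connecting regions between consecutive segments, and slices the corresponding sub-sequences out of a protein string. The helper names are hypothetical, and the protein sequence itself is not reproduced here; any sequence of at least 519 residues can be supplied by the caller.

```python
# 1-based inclusive positions of the nine predicted transmembrane segments,
# as stated in the text for SEQ ID Nos: 278, 282 and 300.
TM_SEGMENTS = [
    (85, 105), (116, 136), (164, 184), (187, 207), (332, 352),
    (376, 396), (404, 424), (465, 485), (499, 519),
]


def segment_lengths(segments):
    """Length in residues of each (start, end) segment."""
    return [end - start + 1 for start, end in segments]


def connecting_regions(segments):
    """1-based inclusive ranges lying between consecutive segments."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(segments, segments[1:])
    ]


def extract_segments(protein: str, segments):
    """Slice each segment out of the protein (converting to 0-based slices)."""
    pieces = []
    for start, end in segments:
        if not (1 <= start <= end <= len(protein)):
            raise ValueError(f"segment {start}-{end} lies outside the sequence")
        pieces.append(protein[start - 1:end])
    return pieces
```

Each of the nine listed ranges spans 21 residues, and the third and fourth segments are separated by only two residues (positions 185 to 186).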

The normal functioning of the eukaryotic cell requires that all the newly synthesized proteins
be correctly folded, modified, and delivered to specific intra- and extracellular sites. Newly synthesized
membrane and secretory proteins enter a cellular sorting and distribution network during or
immediately after synthesis and are routed to specific locations inside and outside the cell. The initial
compartment in this process is the endoplasmic reticulum (ER) where proteins undergo modifications
such as glycosylation, disulfide bond formation, and assembly into oligomers. The modified proteins
are then transported through a series of membrane-bound compartments which include the various
cisternae of the Golgi complex where further carbohydrate modifications occur. Transport between
compartments occurs by means of vesicles that bud and fuse in a manner specific to the type of protein
being transported. Once within the secretory pathway, proteins do not have to cross a membrane to
reach the cell surface. Disruptions in the cellular secretory pathway have been implicated in several
human diseases. In familial hypercholesterolemia the low density lipoprotein receptors remain in the
ER rather than moving to the cell surface (Pathak, et al., J. Cell Biol., 106:1831-1841, 1988). Altered
transport and processing of the beta-amyloid precursor protein (betaAPP) involves the putative vesicle
transport protein presenilin, and may play a role in early-onset Alzheimer disease (Levy-Lahad et al.,
Science, 269:973-977, 1995). Changes in the ER-derived calcium homeostasis have been associated
with diseases such as cardiomyopathy, cardiac hypertrophy, myotonic dystrophy, Brody disease,
Smith-McCort dysplasia and diabetes mellitus.

It is believed that the protein of the invention represents a new ER integral transmembrane
protein. This protein probably plays a role in post-translational modifications of secreted and
membrane proteins. Its dysregulated expression may be linked to disorders such as the diseases
referred to above. Preferred polypeptides of the invention are polypeptides comprising the amino acids of SEQ
ID Nos: 278, 282 and 300 from positions 85 to 105, 116 to 136, 164 to 184, 187 to 207, 332 to 352, 376
to 396, 404 to 424, 465 to 485 and 499 to 519. Other preferred polypeptides of the invention are
fragments of SEQ ID Nos: 278, 282 and 300 having any of the biological activities described herein.

One object of the present invention is to provide compositions and methods of targeting heterologous
polypeptides to the endoplasmic reticulum by recombinantly or chemically fusing a fragment of the
proteins of the invention to a heterologous polypeptide. Preferred fragments are any fragments of the
proteins of the invention, or part thereof, that may contain targeting signals for the endoplasmic
reticulum such as those described in Pidoux AL, Armstrong J, EMBO J 1992 Apr;11(4):1583-91; Munro
S, Pelham HR, Cell 1987 Mar 13;48(5):899-907; Pelham HR, Trends Biochem Sci 1990
Dec;15(12):483-6.

In another embodiment, the invention relates to methods and compositions using the protein
of the invention or part thereof as marker proteins to selectively identify tissues, preferably salivary
glands. For example, the proteins of the invention or part thereof may be used to synthesize specific
antibodies using any techniques known to those skilled in the art including those described herein.
Such tissue-specific antibodies may then be used to identify tissues of unknown origin, for example,
forensic samples, differentiated tumor tissue that has metastasized to foreign bodily sites, or to
differentiate different tissue types in a tissue cross-section using immunochemistry.

Moreover, antibodies to the proteins of the invention or parts thereof may be used for the
detection of endoplasmic reticulum in immunochemistry, for example, using any techniques known to
those skilled in the art including those described herein.

Protein of SEQ ID No: 281 (internal designation 174-10-2-0-F8-CS)

The protein of SEQ ID No: 281 is homologous to PET117 (SwissProt ID: Q02771). MTC
is overexpressed in the brain, dystrophic muscle, fetal liver, placenta and salivary glands.

The protein of the invention, herein named MTC, presents a certain homology with the
yeast PET117 protein precursor (22% identical amino acids, 39% positive amino acids when
aligned by BLASTP 2.0.9). MTC appears to be a novel member of the PET family.

Cytochrome c oxidase (complex IV), an enzyme complex located in the mitochondrial inner
membrane, is the terminal member of the mitochondrial electron transport chain. The oxidation
reaction catalyzed by cytochrome c oxidase is exergonic and is coupled to the translocation of
protons across the membrane. This reaction provides the energy needed to drive the synthesis of
ATP by the mitochondrial oxidative phosphorylation system and is essential for respiratory
metabolism in aerobic eukaryotes. Cytochrome c oxidase is made up of as many as 13 non-identical
protein subunits, of which 3 are encoded by the mitochondrial genome, and contains
several prosthetic groups (including heme groups a and a3).

The composition of cytochrome c oxidase requires that synthesis and assembly of a
functional enzyme complex occur in several distinct steps, including: 1) synthesis of the protein
subunits, 2) transport of the subunits from their site of synthesis to their site of function in the
mitochondrial inner membrane, 3) synthesis of hemes a and a3, and 4) assembly of the subunits with
each other and with the prosthetic groups. A number of "accessory" genes (i.e., genes not encoding
protein subunits of the final assembled cytochrome c oxidase complex) are required for the
production of functional cytochrome c oxidase (McEwen JE et al. - J Biol Chem - 1986,
261(25):11872-9). Some of these are required for the expression of mitochondrial-encoded
cytochrome c oxidase subunits, while others are needed for the proper assembly of active
cytochrome c oxidase.

The nuclear genes PET117 and PET191 belong to this class of "accessory genes" required
for the assembly of active mitochondrial cytochrome c oxidase (McEwen JE et al. - Curr Genet. -
1993, 23(1):9-14). The role of PET genes and the proteins that they encode remains obscure,
although mutation experiments in S. cerevisiae have clearly shown that they are essential for the
production of active cytochrome c oxidase (McEwen JE et al. - J Biol Chem - 1986,
261(25):11872-9; McEwen JE et al. - Curr Genet. - 1993, 23(1):9-14).

One aspect of the subject invention provides compositions and methods of using the
nucleotide sequence of SEQ ID No: 40, or its complement, in molecular biology techniques. In one
embodiment, the MTC2 sequence is encoded by clone 174-10-2-0-F8-CS. References to a
polynucleotide of SEQ ID NO: 40 and polypeptide of SEQ ID NO: 281 are interchangeable with the
corresponding polynucleotides of the human cDNA of clone 174-10-2-0-F8-CS and polypeptides
encoded thereby. These techniques include, but are not limited to: PCR; production of recombinant
MTC, or biologically active fragments thereof; generation of antisense RNA and DNA, their chemical
analogs and the like; hybridization probes; and chromosome gene mapping.

As is apparent to one skilled in the art, all of the non-limiting techniques listed above can be
practiced with fragments of the mtc2 gene. Given the well-known nature of these techniques, the
skilled artisan will be able to select an appropriate length of the mtc2 polynucleotide for use in the
techniques. For recombinant expression of protein, a preferred embodiment provides the full length
MTC2 gene in an expression vector.

For example, the nucleotide sequence of SEQ ID No: 40 or its complement can be used to
generate hybridization probes for mapping the naturally occurring genomic sequence. The
sequence can be mapped to a particular chromosome or to a specific region of the chromosome
using well-known techniques. These include in situ hybridization to chromosomal spreads, flow-sorted
chromosomal preparations, or artificial chromosome constructions such as yeast artificial
chromosomes, bacterial artificial chromosomes, bacterial P1 constructions or single chromosome
cDNA libraries as reviewed in Price (Price CM - Blood Rev. - 1993, 7(2):127-34) and Trask
(Trask BJ - Trends Genet. - 1991, 7(5):149-54).

In situ hybridization of chromosomal preparations and physical mapping techniques such as
linkage analysis using established chromosomal markers are invaluable in extending genetic maps
that provide valuable information to investigators searching for disease genes using positional
cloning or other gene discovery techniques. Once a disease or syndrome has been crudely localized
by genetic linkage to a particular genomic region, any sequences mapping to that area can represent
associated or regulatory genes for further investigation. The nucleotide sequence of the present
invention can also be used to detect differences in the chromosomal location due to translocation,
inversion, etc. among normal, carrier or affected individuals.

The subject invention also provides methods of using MTC polypeptides and
polynucleotides encoding said polypeptides in preventing or reducing the incidence of apoptosis in
cells. Dysfunctions in the mitochondrial electron transport chain result in cellular apoptosis or
necrosis. In one embodiment, MTC is added to an in vitro culture of mammalian cells in an amount
effective to reduce apoptosis. In another embodiment, cells are transfected with vectors comprising
MTC polynucleotides which cause the expression of MTC peptides. MTC used in these
embodiments can, optionally, contain mitochondrial targeting sequences. In another embodiment,
MTC or MTC2 is encoded by clone 174-10-2-0-F8-CS.

In another embodiment, MTC polypeptides and polynucleotides encoding said polypeptides
can be used in the diagnosis, treatment and/or prophylaxis of disorders associated with apoptosis or
impairment of the mitochondrial respiratory electron transport chain. Polynucleotides can also be
used in antisense protocols for certain disorders to impair the function of the mitochondrial electron
transport chain. These disorders include, but are not limited to, immune deficiency syndromes
(including AIDS); type I diabetes; pathogenic infections; cardiovascular and neurological injury;
alopecia; aging; degenerative diseases such as Alzheimer's Disease, Parkinson's Disease,
Huntington's disease; dystonia; Leber's hereditary optic neuropathy; schizophrenia; neonatal hepatic
failure and ketoacidotic coma necrosis; and myodegenerative disorders such as "mitochondrial
encephalopathy, lactic acidosis, and stroke" (MELAS), "myoclonic epilepsy ragged red fiber
syndrome" (MERRF); mitochondriocytopathies, Leigh syndrome, fatal infantile
cardioencephalomyopathy, ataxia; encephalopathies, aging, neurodegenerative diseases,
myopathies, and cancers. As would be apparent to the routineer, these methods can be practiced
with full length MTC polypeptides and polynucleotides encoding said polypeptides as well as
biologically active fragments of the same which retain biological activity.

For diagnostic purposes, the expression of the protein of the invention could be investigated
using any of the Northern blotting, RT-PCR or immunoblotting methods well known to those
skilled in the art. For prophylaxis and/or treatment purposes, SEQ ID No: 40, its complement, or
fragments of either, can be used to enhance electron transport and increase energy delivery using
any of the gene therapy methods known to those skilled in the art. Likewise, SEQ ID No: 40,
its complement, and fragments of either can be used to impair electron transport and decrease
energy delivery using any of the antisense methodologies known to those skilled in the art.

Protein of SEQ ID NO:392 (internal designation 145-7-2-0-G5-CS)

The protein of SEQ ID No:392, encoded by the cDNA of SEQ ID No:151, is homologous
to Unc-18 proteins, also known as the STXBP or Sec-1 family. The protein of the invention is
strongly expressed in the fetal kidney.

Amino acids 89 to 107 of the protein of the invention present the EMotif signature for
proteins of the Sec-1 family (BlocksPlus PF00995). Furthermore, BLAST analysis (BLASTP
version 2.0.9) of the amino acid sequence of the invention reveals that it is homologous to a number
of proteins belonging to the Unc-18/Sec-1 family. Preferred polypeptides of the invention are those
that comprise amino acids 94, 95, and/or 100, which are conserved in more than 80% of the Sec-1
family members; and/or amino acids 43, 89, and/or 97, which are conserved in more than 60% of
Sec-1 family members. Other preferred polypeptides of the invention are any fragment of SEQ ID
NO:392 having any of the biological activities described herein.
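
As a small worked illustration of the positional criteria above, the sketch below reports which of the listed conserved residues fall within a given fragment of SEQ ID NO:392. The positions and percentages are taken from the text; the constant and function names are hypothetical, and fragment boundaries are assumed to be 1-based and inclusive.

```python
# Positions stated in the text (1-based numbering of SEQ ID NO:392).
HIGHLY_CONSERVED = (94, 95, 100)      # conserved in more than 80% of Sec-1 family members
MODERATELY_CONSERVED = (43, 89, 97)   # conserved in more than 60% of Sec-1 family members
EMOTIF_RANGE = (89, 107)              # span of the EMotif signature


def covered_positions(fragment_start: int, fragment_end: int, positions):
    """Return the listed positions that lie inside the fragment (inclusive)."""
    if fragment_start > fragment_end:
        raise ValueError("fragment_start must not exceed fragment_end")
    return [p for p in positions if fragment_start <= p <= fragment_end]
```

A fragment spanning the EMotif signature (positions 89 to 107) covers all three highly conserved residues and two of the three moderately conserved ones; position 43 lies outside it.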

The normal function and organization of eukaryotic cells is dependent on transport of
various vesicles that selectively shuttle membrane and cargo between distinct compartments of the
secretory and endocytotic pathways. A number of key proteins involved in membrane targeting and
exocytosis have been identified, and a fundamental set of interactions has been defined and placed
into a model called the SNARE (Soluble N-ethylmaleimide-sensitive Attachment protein REceptor)
hypothesis (Rothman J - Nature - 1994, 372: p55-63). According to the SNARE hypothesis,
vesicles dock to a target membrane through the interaction of complementary sets of vesicular
(v-SNARE) and target (t-SNARE) membrane proteins. Our understanding of vesicle trafficking has,
to a large extent, been facilitated by characterization of synaptic vesicles in neurons. In synaptic
vesicle exocytosis, the vesicular protein synaptobrevin (also called Vesicle-Associated Membrane
Protein; VAMP) is the v-SNARE, and the plasma membrane-associated protein SNAP-25
(Synaptosomal-Associated Protein of 25 kDa) and syntaxin 1 function as t-SNAREs. Formation of
the SNARE complex (or core complex) is followed by recruitment of the cytosolic proteins alpha,
beta and gamma SNAP (Soluble N-ethylmaleimide-sensitive Attachment Protein) and NSF
(N-ethylmaleimide-Sensitive Factor), which are required for membrane fusion. Proteins from two gene
families have been identified as key regulators of SNARE complex assembly. These include
members of the small GTP-binding family (e.g. Rabs) and the Sec-1 family. The Sec-1 gene is one
of ten genes identified as essential for the final stages of protein secretion in yeast (S. cerevisiae).
Sec-1 homologues have been identified in the nervous system of C. elegans (Unc-18), D.
melanogaster (Rop) and mammals. In mammals, the protein has been termed Mammalian
homologue of the Unc-18 gene (Munc-18), rbSec1 (Rat Brain Sec1) or n-Sec1 (neural-specific
Sec1).

Sec-1-related proteins are involved in the processes of vesicle targeting, docking and/or
fusion. Sec-1-related proteins interact directly with the t-SNARE syntaxin, and Munc-18 has been
found to interact with syntaxin isoforms 1a, 2 and 3. However, Munc-18 has not been found to be
part of the 20S SNARE/SNAP/NSF protein complex. In vitro, the binding of Munc-18 to syntaxin
inhibits the interaction of syntaxin with VAMP and SNAP-25 as well as SNAP-23 (a homologue of
SNAP-25) and thereby negatively regulates the formation of the synaptic SNARE fusion complex.
In agreement with a negative regulatory role of Sec-1/Munc-18 proteins in neurotransmitter release
are results showing that microinjections of Sec-1 into the presynaptic terminal of the squid giant
synapse inhibit evoked transmitter release (Dresbach T. et al. - J Neurosci. - 1998, 18: p2923-2932).
Furthermore, overexpression of Rop, Unc-18, Sec-1 and Munc-18 all result in phenotypes
associated with a complete block in neurotransmitter release and/or secretion (Hosono R. et al. - J
Neurochem - 1992; 58: p1517-1525; Harrison S. et al. - Neuron - 1994; 13: p555-566; Novick P.
et al. - Cell - 1981; 25: p461-469; Verhage M. et al. - Science - 2000; 287: p864-869). Point
mutation experiments involving the Rop gene suggest that Rop is a rate-limiting regulator of
exocytosis that performs both stimulatory and inhibitory functions in neurotransmission (Wu M. et
al. - EMBO J - 1998; 17: p127-139). The reduction in neurotransmitter release seen after both
overexpression of Munc-18 and mutations in Munc-18 homologues indicates that Sec-1 proteins not
only sequester syntaxins from other proteins but also assist the syntaxins in adopting a functional
conformation or facilitate interactions between syntaxins and other proteins by a chaperone-like
action. The necessity of Sec1-related proteins is believed to result, in part, from their direct and
high affinity interaction with members of the t-SNARE family of syntaxin proteins and from the
control by this complex of a v- and t-SNARE protein interaction required for vesicle fusion.

The SNARE mechanism of exocytosis appears to be conserved both evolutionarily (most of
the components have homologues in species from yeast to mammals) and functionally (each of the
principal components is a member of a multigene family). This latter point is supported by work
showing that components of this pathway are found in different cell types (neurons, neutrophils
and pancreatic beta-cells) (Brumell J. et al. - J Immunol - 1995; 155: p5750-5759; Zhang W et al. -
J Biol Chem. - 2000 Oct 6, electronic publication).

It is believed that the protein of SEQ ID NO:392 is a member of the Unc-18/Sec-1 family,
and thus plays a key role in the regulation of various processes including vesicle targeting, docking
and fusion.

One embodiment of the present invention relates to the use of the protein of SEQ ID
NO:392 or the cDNA of SEQ ID NO:151 or any part thereof to identify fetal kidney tissue
and cells derived from this tissue, since the protein of the invention is strongly expressed in this
tissue. In addition, the protein of the invention can be used to specifically label components of the
secretory pathway within cells. Assays for the detection of cells expressing the protein of the
invention, or part thereof, can be developed using techniques known to those skilled in the art. For
example, the protein of the invention, or part thereof, can be used to generate antibodies or
antiserum, by techniques well known to those skilled in the art. Antibodies or antiserum can also be
used for quantitative analysis or detection of the protein of the invention, by methods such as
enzyme-linked immunosorbent assays (ELISA) or by any other technique known to those skilled in
the art. Another possible technique involves the use of marked syntaxins, since Sec1-related
proteins are known to bind to syntaxins.

In another embodiment of the present invention, the present polynucleotides and
polypeptides can be used to diagnose, treat, and/or prevent any of a large number of diseases and
disorders characterized by abnormal exocytosis, such as, but not limited to: allergies including hay
fever, asthma, and urticaria; neurologic disorders, a number of which result from abnormal
neurotransmitter secretion (for example, depression is associated with decreased serotonin
secretion); autoimmune hemolytic anemia; cancers, especially hormone-dependent cancers such as
those stimulated by androgens (for example, prostate cancer) or estrogens (for example, breast
cancer), leukemias or lymphomas; ulcerative colitis; type 2 diabetes, which in some cases is
associated with decreased insulin secretion; proliferative glomerulonephritis; inflammatory bowel
disease; growth failure due to decreased secretion of growth hormone; multiple sclerosis;
myasthenia gravis; rheumatoid arthritis and osteoarthritis; scleroderma; Chediak-Higashi and Sjogren's
syndromes; systemic lupus erythematosus; thyroiditis; toxic shock syndrome; traumatic tissue
damage; viral, bacterial, fungal and protozoal infections; and other physiologic/pathologic disorders
associated with induced or otherwise abnormal vesicular trafficking.

An association between the level of expression and/or activity of the present protein and
the presence or absence of any condition associated with abnormal vesicular trafficking, such as any
of the above-listed disorders, can readily be assessed by detecting the level of expression or activity
of the protein by, e.g., Northern blot, Western blot, ELISA, or any standard in vitro or in vivo assay
for protein activity, and correlating the observed level of expression or activity with the presence or
absence of the disorder. For those disorders found to be positively associated with the protein of the
invention, a diagnostic or screening assay can be readily developed where the detection of an
elevated level of protein or protein activity is indicative of the presence of the disease, or of a
propensity to develop the disease. Further, any such diseases or conditions can be treated or
prevented by inhibiting the expression or activity of the protein, for example by administering to a
patient suffering from the disorder any inhibitor including, but not limited to, antibodies, antisense
oligonucleotides, dominant negative forms of the protein, and small molecule inhibitors of protein
expression or activity. Alternatively, disorders negatively associated with the protein of the
invention can be diagnosed or screened for by detecting the level of the present protein or protein
activity, where a decreased level of the protein or protein activity is indicative of the presence of the
disease, or of a propensity to develop the disease. Such disorders negatively associated with the
protein of the invention can be treated or prevented by increasing the level of the protein or protein
activity, for example by administering to a patient any of a number of agents including, but not
limited to, the protein itself, a polynucleotide encoding the protein, or a heterologous compound that
enhances the expression or activity of the protein.
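
The correlation step described above can be sketched numerically. The example below computes a Pearson correlation between measured expression levels and a binary disorder status (the point-biserial case) and labels the association as positive, negative or absent. This is only one possible way to quantify such an association; the function names, the 0.5 cutoff and the expression values in the usage note are hypothetical and are not drawn from the specification, and a real study would add a significance test and appropriate controls.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


def association(levels, has_disorder, cutoff: float = 0.5) -> str:
    """Classify the association between expression level and disorder status.

    Assumption: |r| >= cutoff is treated as an association; the cutoff is an
    arbitrary illustrative value, not a statistical test.
    """
    status = [1.0 if d else 0.0 for d in has_disorder]
    r = pearson_r(levels, status)
    if r >= cutoff:
        return "positive"
    if r <= -cutoff:
        return "negative"
    return "none"
```

With hypothetical levels `[1.0, 1.2, 0.9, 3.1, 2.8, 3.3]` and the last three samples taken from affected individuals, the function reports a positive association; swapping the status labels gives a negative one.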

Protein of SEQ ID NO:419 (internal designation 188-9-1-0-C10-CS)

The protein of SEQ ID NO:419, highly expressed in the brain and placenta, is encoded by
the cDNA of SEQ ID NO:178, is localized preferentially in the endoplasmic reticulum, and is
homologous to the yeast integral membrane protein SFT2p, a member of the SNARE-related family
(GenBank accession number X79489). SFT2p is well conserved in C. elegans and in mice
(accession numbers CAA93859 and AA790425, respectively), and plays an important role in the
protein trafficking and fusion machinery of eukaryotic cells. The 159-amino-acid-long protein of
the invention, which is similar in size and in membrane topology to the SFT2p protein, displays
four conserved hydrophobic stretches from positions 36 to 56, 66 to 86, 98 to 118 and 122 to 142,
forming a tetra-spanning membrane protein. This topology is also found in the Got1p protein,
another well-conserved SNARE-related protein with functions similar to those of the SFT2p protein
(accession number AL010285 for P. falciparum, U23521 for C. elegans), as described in Conchon et
al., EMBO J., 18(14):3934-3946 (1999).
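
By way of a non-limiting illustration, the positions recited above can be checked arithmetically. The following Python sketch assumes 1-based, inclusive residue numbering, which the text does not state explicitly; the names SEGMENTS, segment_lengths and loop_lengths are hypothetical and are introduced only for this example. It reports the length of each hydrophobic stretch and of the regions flanking the stretches within the 159-amino-acid protein.

```python
# Hydrophobic stretches of the 159-residue protein, as (start, end) pairs.
# Assumption: positions are 1-based and inclusive.
PROTEIN_LENGTH = 159
SEGMENTS = [(36, 56), (66, 86), (98, 118), (122, 142)]


def segment_lengths(segments):
    """Return the length of each segment, counting both end positions."""
    return [end - start + 1 for start, end in segments]


def loop_lengths(segments, protein_length):
    """Return the number of residues before, between, and after the segments."""
    loops = []
    previous_end = 0
    for start, end in segments:
        loops.append(start - previous_end - 1)
        previous_end = end
    loops.append(protein_length - previous_end)
    return loops


if __name__ == "__main__":
    print("stretch lengths:", segment_lengths(SEGMENTS))
    print("flanking region lengths:", loop_lengths(SEGMENTS, PROTEIN_LENGTH))
```

Under that numbering assumption, each of the four stretches spans 21 residues, and the stretches together with the flanking regions account for all 159 residues.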

Eukaryotic proteins are synthesized within the endoplasmic reticulum (ER), are delivered
from the ER to the Golgi complex for post-translational processing and sorting, and are transported
from the Golgi to specific intracellular and extracellular destinations. This intracellular and 
extracellular movement of protein molecules is termed vesicle trafficking. Trafficking is 
accomplished by the packaging of protein molecules into specialized vesicles which bud from the 
donor organelle membrane and fuse to the target membrane (Palade, Science 189:347-358 (1975)). 

Numerous proteins are necessary for the formation, targeting, and fusion of transport
vesicles and for the proper sorting of proteins into these vesicles. The vesicle trafficking machinery 
includes coat proteins which promote the budding of vesicles from donor membranes, vesicle- and 
target-specific identifiers (v-SNAREs and t-SNAREs) which bind to each other and dock the vesicle 
to the target membrane (Nichols et al., Nature 387:199-202, 1997), and proteins which bind to
SNARE complexes and initiate fusion of the vesicle to the target membrane (SNAPs).

SFT2p is a conserved yeast protein with four transmembrane domains that is resident in
punctate structures corresponding to the late Golgi compartment, and which enters presumptive
retrograde intra-Golgi vesicles whose fusion depends on two t-SNARE proteins, Sed5p and Sft1p
(Wooding and Pelham, Mol. Biol. Cell 9:2667-2680 (1998)). Its genetic interaction with Sed5p
suggests that SFT2p is an additional membrane component involved in the docking or fusion
process. In vivo experiments have shown that deletion of GOT1p or SFT2p alone does not affect
cell growth, but repression of both of these proteins results in a significant accumulation of ER
membrane, suggesting that the presence of either SFT2p or GOT1p is required for the maintenance
of efficient ER-Golgi transport (Conchon et al., supra). It has also been shown that Got1p normally
facilitates Sed5p-dependent fusion events, while Sft2p performs a related function in the late Golgi
(Conchon et al., supra).

The etiology of numerous human diseases and disorders can be attributed to defects in the
trafficking of proteins to organelles or the cell surface. For example, defects in the trafficking of 
membrane-bound receptors and ion channels have been implicated in cystic fibrosis (cystic fibrosis 
transmembrane conductance regulator; CFTR), glucose-galactose malabsorption syndrome 
(Na+/glucose cotransporter), hypercholesterolemia (low-density lipoprotein (LDL) receptor),
and forms of diabetes mellitus (insulin receptor). Abnormal hormonal secretion has been linked to
disorders including diabetes insipidus (vasopressin), hyper- and hypoglycemia (insulin, glucagon),
Graves' disease and goiter (thyroid hormone), and Cushing's and Addison's diseases
(adrenocorticotropic hormone; ACTH). 

Further, cancer cells secrete excessive amounts of hormones or other biologically active
peptides. Disorders related to excessive secretion of biologically active peptides by tumor cells 
include: fasting hypoglycemia due to increased insulin secretion from insulinoma-islet cell tumors; 
hypertension due to increased epinephrine and norepinephrine secreted from pheochromocytomas 
of the adrenal medulla and sympathetic paraganglia; and carcinoid syndrome, which includes
abdominal cramps, diarrhea, and valvular heart disease, caused by excessive amounts of vasoactive
substances (serotonin, bradykinin, histamine, prostaglandins, and polypeptide hormones) secreted 
from intestinal tumors. Ectopic synthesis and secretion of biologically active peptides (peptides not 
expected from a tumor) includes ACTH and vasopressin in lung and pancreatic cancers; parathyroid 
hormone in lung and bladder cancers; calcitonin in lung and breast cancers; and thyroid-stimulating
hormone in medullary thyroid carcinoma.

It is believed that the protein of SEQ ID NO:419 or part thereof is an integral membrane
protein of the SNARE-related family, and more specifically is presumed to be the human homologue
of the yeast SFT2p protein. Thus, the protein of the invention plays a role in the secretory and
endocytic pathways of eukaryotic cells through fusion and transport of vesicles from the endoplasmic
reticulum to late Golgi cisternae. Preferred polypeptides of the invention are polypeptides comprising
the amino acids of SEQ ID NO:419 that form the four transmembrane domains, from positions 36 to 56,
66 to 86, 98 to 118 and 122 to 142. Other preferred polypeptides of the invention are fragments of SEQ
ID NO:419 having any of the biological activities described herein.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a new marker protein to selectively identify secretory and endocytic
traffic, preferably in the endoplasmic reticulum and more preferably in the late Golgi cisternae. For
example, the protein of the invention or part thereof may be detected using specific antibodies
generated against the protein using any technique known to those skilled in the art. Such
organelle-specific antibodies may then be used to identify cells with disrupted trafficking systems such as in
differentiated tumor cells or to differentiate specific organelle types in a cell cross-section using
immunochemistry. In addition, the protein of the invention can be used to specifically identify cells
of the brain and/or placenta, tissues in which the protein is overexpressed.

Another embodiment of the present invention relates to methods of targeting heterologous
compounds, such as polypeptides or polynucleotides, to the endoplasmic reticulum and 
preferentially to late Golgi vesicles by recombinantly or chemically fusing a fragment of the protein 
of the invention to the heterologous polypeptide or polynucleotide. Such fusion proteins may be 
engineered to contain a cleavage site located between a sequence encoding the protein of the
invention and the heterologous protein sequence, so that the protein of the invention may be cleaved 
and purified away from the heterologous moiety. Preferred fragments of the protein that can be 
used in such applications are the four transmembrane domains or any other fragments of the protein 
of the invention, or part thereof, that may contain targeting signals for ER or Golgi organelles as 
defined in Conchon et al., supra; Wooding and Pelham, supra. Such heterologous compounds may
be targeted to the secretory pathway to modulate ER-Golgi endocytic and secretory activities. In 
one embodiment, the protein of the invention can be used to screen peptide libraries for inhibitors of 
traffic activity, as detected by the accumulation of ER membranes or Golgi vesicles as described in 
Conchon et al., supra. 

In still another embodiment, the protein of the invention is used to diagnose, prevent and/or
treat any of a number of disorders in which trafficking and/or the fusion machinery is affected, 
including, but not limited to, endocrine, secretory, inflammatory, and gastrointestinal disorders, 
such as cancer, cystic fibrosis (cystic fibrosis transmembrane conductance regulator; CFTR, as well 
as membrane-bound receptors and ion channels associated with CFTR), glucose-galactose
malabsorption syndrome (Na+/glucose cotransporter), hypercholesterolemia (low-density
lipoprotein (LDL) receptor), and forms of diabetes mellitus (insulin receptor), abnormal hormonal
secretion linked to disorders including diabetes insipidus (vasopressin), hyper- and hypoglycemia
(insulin, glucagon), Graves' disease and goiter (thyroid hormone), Cushing's and Addison's diseases
(adrenocorticotropic hormone; ACTH), disorders related to excessive secretion of biologically
active peptides by tumor cells including fasting hypoglycemia due to increased insulin secretion
from insulinoma-islet cell tumors, hypertension due to increased epinephrine and norepinephrine 
secreted from pheochromocytomas of the adrenal medulla and sympathetic paraganglia, carcinoid 
syndrome, which includes abdominal cramps, diarrhea, and valvular heart disease, caused by 
excessive amounts of vasoactive substances (serotonin, bradykinin, histamine, prostaglandins, and
polypeptide hormones) secreted from intestinal tumors. Ectopic synthesis and secretion of
biologically active peptides (peptides not expected from a tumor) includes ACTH and vasopressin
in lung and pancreatic cancers; parathyroid hormone in lung and bladder cancers; calcitonin in lung 
and breast cancers; and thyroid-stimulating hormone in medullary thyroid carcinoma. 

An association between the level of expression and/or activity of the present protein with
the presence or absence of any condition associated with abnormal vesicular trafficking and/or
secretion, such as any of the above-listed disorders, can readily be assessed by detecting the level of
expression or activity of the protein by, e.g., Northern blot, Western blot, ELISA, or any standard in
vitro or in vivo assay for protein activity, and correlating the observed level of expression or activity
with the presence or absence of the disorder. For those disorders found to be positively associated 
with the protein of the invention, a diagnostic or screening assay can be readily developed where the 
detection of an elevated level of protein or protein activity is indicative of the presence of the 
disease, or of a propensity to develop the disease. Further, any such diseases or conditions can be
treated or prevented by inhibiting the expression or activity of the protein, for example by 
administering to a patient suffering from the disorder any inhibitor including, but not limited to, 
antibodies, antisense oligonucleotides, dominant negative forms of the protein, and small molecule 
inhibitors of protein expression or activity. Alternatively, disorders negatively associated with the
protein of the invention can be diagnosed or screened for by detecting the level of the present
protein or protein activity, where a decreased level of the protein or protein activity is indicative of
the presence of the disease, or of a propensity to develop the disease. Such disorders that are 
negatively associated with the protein of the invention can be treated or prevented by increasing the 
level of the protein or protein activity, for example by administering to a patient any of a number of
agents including, but not limited to, the protein itself, a polynucleotide encoding the protein, or a
heterologous compound that enhances the expression or activity of the protein. 

Cancer cells secrete excessive amounts of hormones or other biologically active peptides. 
Therefore, in another embodiment, antagonists or inhibitors of the protein of the invention may be 
administered to a subject to treat or prevent cancers by inhibiting the traffic activity in transformed
cells. Any type of cancer can be treated or prevented in this way, including, but not limited to,
adenocarcinoma, sarcoma, melanoma, lymphoma, and leukemia. In preferred embodiments, the 
cancers include cancers of glands, tissues, and organs involved in secretion or absorption, such as 
prostate, pancreas, lung, tongue, brain, breast, bladder, adrenal gland, thyroid, liver, uterus, ovary, 
kidney, testes, and organs of the gastrointestinal tract including small intestine, colon, rectum, and
stomach. In a particular aspect, antibodies which are specific for the protein of the invention may
be used directly as an antagonist, or indirectly as a targeting or delivery mechanism for bringing a 
pharmaceutical agent to cells or tissues which express the protein of the invention. In addition, the 
elevated amount of the protein of the invention in tumor cells can readily be used to diagnose or 
screen for cancer, e.g. by measuring and comparing the level of the protein in a cell to that of a
control cell using a specific antibody detected by FACS or using any other detection method known
to those of skill in the art. 
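
The comparison of a test cell with a control cell described above can be sketched as follows. This is a minimal illustration only: the two-fold threshold, the function name classify_level, and the use of a simple ratio of measured signals (for example, mean fluorescence intensities) are assumptions made for the example and are not taught in the present description.

```python
def classify_level(sample_level, control_level, fold_threshold=2.0):
    """Compare a measured protein level with that of a control cell.

    All parameters are hypothetical: the levels are any non-negative
    measurements on a common scale, and fold_threshold is an assumed
    cut-off chosen for illustration, not a validated diagnostic value.
    """
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    if sample_level < 0:
        raise ValueError("sample_level must be non-negative")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")

    ratio = sample_level / control_level
    if ratio >= fold_threshold:
        return "elevated"
    if ratio <= 1 / fold_threshold:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    # Illustrative numbers only; they are not experimental data.
    for sample, control in [(10.0, 4.0), (1.0, 4.0), (5.0, 4.0)]:
        print(sample, control, classify_level(sample, control))
```

In practice, any such cut-off would have to be established empirically for the detection method and sample type used.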



Protein of SEQ ID NO:297 (181-3-3-0-C9-CS)

The protein of SEQ ID NO:297, encoded by the cDNA of SEQ ID NO:56, is homologous to
synaptogyrin 1 (TrEMBL ID: Q9UGZ4). The protein of the invention is highly expressed in the brain,
fetal brain, fetal liver and the testis.

The protein of SEQ ID NO:297 is a splice variant of synaptogyrin 1. The splicing of the
cDNA of SEQ ID NO:56 differs at exon 3: whereas exon 3 of synaptogyrin 1 is 238 base pairs
long, exon 3 of SEQ ID NO:56 is 345 base pairs long. This introduces a frameshift and a stop
codon. Thus, the protein of SEQ ID NO:297 is identical to synaptogyrin 1 up to and including
amino acid 122; the remaining 22 amino acids are entirely different. When compared to
synaptogyrin 1, the protein of the invention presents the same N-terminal domain (which is highly
conserved in all synaptogyrins) and 2 of the 4 transmembrane helices. Preferred polypeptides of the
invention are those that comprise amino acids 1 to 16, which make up the N-terminal cytoplasmic
domain of the protein and which are highly conserved among all members of the synaptogyrin
family (Kedra D. et al. - Hum Genet. - 1998, 103(2):131-141). Other preferred polypeptides of the
invention are those that comprise amino acids 25 to 45 and/or 68 to 88, which make up the two
transmembrane alpha helices. Thus it is believed that the protein of the invention is a member of
the synaptogyrin family.
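
The frameshift described above follows from the exon lengths alone, as the short Python sketch below illustrates. The constant and function names are hypothetical. The total length of 144 amino acids is an inference from the figures given in the text (122 shared residues plus 22 divergent residues) rather than a value stated in it, and residue positions are assumed to be 1-based and inclusive.

```python
# Exon 3 lengths, in base pairs, as given in the text.
SYNAPTOGYRIN_1_EXON3_BP = 238
VARIANT_EXON3_BP = 345

# Residue counts given in the text.
SHARED_RESIDUES = 122      # identical to synaptogyrin 1
DIVERGENT_RESIDUES = 22    # entirely different C-terminal residues

# Transmembrane helices of the variant (assumed 1-based, inclusive).
HELICES = [(25, 45), (68, 88)]


def shifts_reading_frame(length_difference_bp):
    """A length change that is not a multiple of 3 shifts the reading frame."""
    return length_difference_bp % 3 != 0


def inferred_protein_length(shared, divergent):
    """Inferred total length: shared residues plus divergent residues."""
    return shared + divergent


if __name__ == "__main__":
    extra_bp = VARIANT_EXON3_BP - SYNAPTOGYRIN_1_EXON3_BP
    print("additional base pairs in exon 3:", extra_bp)
    print("remainder modulo 3:", extra_bp % 3)
    print("frameshift:", shifts_reading_frame(extra_bp))
    print("inferred length:",
          inferred_protein_length(SHARED_RESIDUES, DIVERGENT_RESIDUES))
    print("helices within shared region:",
          all(end <= SHARED_RESIDUES for _, end in HELICES))
```

The 107 additional base pairs leave a remainder of 2 when divided by 3, which is consistent with the frameshift, and both recited helices fall within the first 122 residues that are shared with synaptogyrin 1.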

Synaptogyrins are closely related to proteins of the synaptophysin family, both of which are
involved in neurotransmission and more generally in exocytosis and vesicle trafficking. Members
of the synaptogyrin family include synaptogyrin 1 (with splice variants 1a, 1b and 1c), cellugyrin
(synaptogyrin 2) and synaptogyrin 3. This family of proteins is also evolutionarily conserved, as
homologues to human synaptogyrin 1 have been found in rats, mice, and C. elegans. Synaptogyrins
and synaptophysin are among the most abundant vesicle components; together they account for
more than 10% of the total vesicle membrane proteins. Although synaptogyrins do not appear to be
required for exocytosis itself (apparently because synaptogyrins and synaptophysins have
overlapping functions), they are essential for the normal regulation of exocytosis.

The normal function and organization of eukaryotic cells is dependent on the transport of
various vesicles that selectively shuttle membrane and cargo between distinct compartments of the
secretory and endocytotic pathways. A number of key proteins involved in membrane targeting and
exocytosis have been identified, and a fundamental set of interactions has been defined and placed
into a model called the SNARE (Soluble N-ethylmaleimide-sensitive factor Attachment protein REceptor)
hypothesis (Rothman J - Nature - 1994, 372:55-63). According to the SNARE hypothesis,
vesicles dock to a target membrane through the interaction of complementary sets of vesicular
(v-SNARE) and target (t-SNARE) membrane proteins. Our understanding of vesicle trafficking has,
to a large extent, been facilitated by characterization of synaptic vesicles in neurons. In synaptic
vesicle exocytosis, the vesicular proteins synaptobrevin (also called Vesicle-Associated Membrane
Protein; VAMP) and synaptogyrin are the v-SNAREs, and the plasma membrane-associated
proteins SNAP-25 (Synaptosomal-Associated Protein of 25 kDa) and syntaxin 1 function as
t-SNAREs. Formation of the SNARE complex (or core complex) is followed by recruitment of the
cytosolic proteins alpha, beta and gamma SNAP (Soluble N-ethylmaleimide-sensitive factor Attachment
Protein) and NSF (N-ethylmaleimide-Sensitive Factor), which are required for membrane fusion.
In transfected PC12 cells, synaptogyrin 1 and synaptophysin 1 are as effective as tetanus toxin light
chain in inhibiting exocytosis (Sugita S. et al. - J Biol Chem. - 1999, 274(27):18893-901),
suggesting that these proteins are strong regulators of exocytosis. More recently, synaptogyrins
have been found to have an essential function in synaptic plasticity (Janz R. et al. - Neuron. - 1999,
24(3):687-700).

The etiology of numerous human diseases and disorders can be attributed to defects in the 
trafficking of proteins to organelles or the cell surface. For example, defects in the trafficking of 
membrane-bound receptors and ion channels have been implicated in cystic fibrosis (cystic fibrosis 
transmembrane conductance regulator; CFTR), glucose-galactose malabsorption syndrome
(Na+/glucose cotransporter), hypercholesterolemia (low-density lipoprotein (LDL) receptor),
and forms of diabetes mellitus (insulin receptor). Abnormal hormonal secretion has been linked to
disorders including diabetes insipidus (vasopressin), hyper- and hypoglycemia (insulin, glucagon),
Graves' disease and goiter (thyroid hormone), and Cushing's and Addison's diseases
(adrenocorticotropic hormone; ACTH). 

Further, cancer cells secrete excessive amounts of hormones or other biologically active
peptides. Disorders related to excessive secretion of biologically active peptides by tumor cells 
include: fasting hypoglycemia due to increased insulin secretion from insulinoma-islet cell tumors; 
hypertension due to increased epinephrine and norepinephrine secreted from pheochromocytomas 
of the adrenal medulla and sympathetic paraganglia; and carcinoid syndrome, which includes
abdominal cramps, diarrhea, and valvular heart disease, caused by excessive amounts of vasoactive
substances (serotonin, bradykinin, histamine, prostaglandins, and polypeptide hormones) secreted 
from intestinal tumors. Ectopic synthesis and secretion of biologically active peptides (peptides not 
expected from a tumor) includes ACTH and vasopressin in lung and pancreatic cancers; parathyroid 
hormone in lung and bladder cancers; calcitonin in lung and breast cancers; and thyroid-stimulating
hormone in medullary thyroid carcinoma.

In one embodiment, the invention relates to methods and compositions using the protein of 
the invention or part thereof as a new marker protein to selectively identify secretory and endocytic 
traffic, preferably in the endoplasmic reticulum and more preferably in the late Golgi cisternae. For 
example, the protein of the invention or part thereof may be detected using specific antibodies
generated against the protein using any technique known to those skilled in the art. Such
organelle-specific antibodies may then be used to identify cells with disrupted trafficking systems such as in
differentiated tumor cells or to differentiate specific organelle types in a cell cross-section using 
immunochemistry. In addition, the protein of the invention can be used to specifically identify cells 
of the brain, fetal brain, fetal liver and the testis, tissues in which the protein is overexpressed. 

Another embodiment of the present invention relates to methods of targeting heterologous
compounds, such as polypeptides or polynucleotides, to the components of the secretory machinery
by recombinantly or chemically fusing a fragment of the protein of the invention to the heterologous
polypeptide or polynucleotide. Such fusion proteins may be engineered to contain a cleavage site
located between a sequence encoding the protein of the invention and the heterologous protein 
sequence, so that the protein of the invention may be cleaved and purified away from the 
heterologous moiety. Such heterologous compounds may be targeted to the secretory pathway to 
modulate ER-Golgi endocytic and secretory activities. In one embodiment, the protein of the
invention can be used to screen peptide libraries for inhibitors of traffic activity, as detected by the 
accumulation of ER membranes or Golgi vesicles as described in Conchon et al., supra. 

In still another embodiment, the protein of the invention is used to diagnose, prevent and/or 
treat any of a number of disorders in which trafficking and/or the fusion machinery is affected,
including, but not limited to, endocrine, secretory, inflammatory, and gastrointestinal disorders,
such as cancer, cystic fibrosis (cystic fibrosis transmembrane conductance regulator; CFTR, as well 
as membrane-bound receptors and ion channels associated with CFTR), glucose-galactose 
malabsorption syndrome (Na+/glucose cotransporter), hypercholesterolemia (low-density
lipoprotein (LDL) receptor), and forms of diabetes mellitus (insulin receptor), abnormal hormonal
secretion linked to disorders including diabetes insipidus (vasopressin), hyper- and hypoglycemia
(insulin, glucagon), Graves' disease and goiter (thyroid hormone), Cushing's and Addison's diseases
(adrenocorticotropic hormone; ACTH), disorders related to excessive secretion of biologically 
active peptides by tumor cells including fasting hypoglycemia due to increased insulin secretion 
from insulinoma-islet cell tumors, hypertension due to increased epinephrine and norepinephrine
secreted from pheochromocytomas of the adrenal medulla and sympathetic paraganglia, carcinoid
syndrome, which includes abdominal cramps, diarrhea, and valvular heart disease, caused by 
excessive amounts of vasoactive substances (serotonin, bradykinin, histamine, prostaglandins, and 
polypeptide hormones) secreted from intestinal tumors. Ectopic synthesis and secretion of 
biologically active peptides (peptides not expected from a tumor) includes ACTH and vasopressin
in lung and pancreatic cancers; parathyroid hormone in lung and bladder cancers; calcitonin in lung
and breast cancers; and thyroid-stimulating hormone in medullary thyroid carcinoma. 

An association between the level of expression and/or activity of the present protein with 
the presence or absence of any condition associated with abnormal vesicular trafficking and/or 
secretion, such as any of the above-listed disorders, can readily be assessed by detecting the level of
expression or activity of the protein by, e.g., Northern blot, Western blot, ELISA, or any standard in
vitro or in vivo assay for protein activity, and correlating the observed level of expression or activity
with the presence or absence of the disorder. For those disorders found to be positively associated 
with the protein of the invention, a diagnostic or screening assay can be readily developed where the 
detection of an elevated level of protein or protein activity is indicative of the presence of the
disease, or of a propensity to develop the disease. Further, any such diseases or conditions can be
treated or prevented by inhibiting the expression or activity of the protein, for example by 
administering to a patient suffering from the disorder any inhibitor including, but not limited to,
antibodies, antisense oligonucleotides, dominant negative forms of the protein, and small molecule
inhibitors of protein expression or activity. Alternatively, disorders negatively associated with the 
protein of the invention can be diagnosed or screened for by detecting the level of the present 
protein or protein activity, where a decreased level of the protein or protein activity is indicative of
the presence of the disease, or of a propensity to develop the disease. Such disorders that are
negatively associated with the protein of the invention can be treated or prevented by increasing the
level of the protein or protein activity, for example by administering to a patient any of a number of 
agents including, but not limited to, the protein itself, a polynucleotide encoding the protein, or a 
heterologous compound that enhances the expression or activity of the protein. 

Cancer cells secrete excessive amounts of hormones or other biologically active peptides.
Therefore, in another embodiment, antagonists, inhibitors, or other modulators of the protein of the 
invention may be administered to a subject to treat or prevent cancers by inhibiting the traffic 
activity in transformed cells. Any type of cancer can be treated or prevented in this way, including, 
but not limited to, adenocarcinoma, sarcoma, melanoma, lymphoma, and leukemia. In preferred
embodiments, the cancers include cancers of glands, tissues, and organs involved in secretion or
absorption, such as prostate, pancreas, lung, tongue, brain, breast, bladder, adrenal gland, thyroid, 
liver, uterus, ovary, kidney, testes, and organs of the gastrointestinal tract including small intestine, 
colon, rectum, and stomach. In a particular aspect, antibodies which are specific for the protein of 
the invention may be used directly as an antagonist, or indirectly as a targeting or delivery
mechanism for bringing a pharmaceutical agent to cells or tissues which express the protein of the
invention. 

In addition, the present protein can be used to diagnose, treat, and prevent any neurological 
or psychiatric disorder or condition associated with abnormal neurotransmitter release, such as 
depression, which is associated with decreased serotonin secretion, or any neurological function, 
e.g. memory, which could be enhanced or otherwise modulated by altering the quantity, frequency,
or any other property of neurotransmitter release in one or more cell types in the nervous system. 

Proteins of SEQ ID NOs:247 and 246 (internal designations 105-031-2-0-D3-CS and 105-031-1-0-A2-CS)

The proteins of SEQ ID NOs:247 and 246, encoded by the cDNAs of SEQ ID NOs:6 and 5,
respectively, are overexpressed in liver, pancreas, and prostate. The proteins of the invention are
strongly homologous to the human membrane-bound protein PRO836 (GENSEQP accession
number: W63687), and to the human secreted protein 7 (GENSEQP accession number: Y57941).
The proteins of the invention also share homology with the chaperone-associated protein SLS1p,
found in the yeast Yarrowia lipolytica (GENPEPT accession number Z50154), having 27% identity
from amino acids 68 to 340 of the protein of SEQ ID NO:247. In addition, the proteins of SEQ ID
NOs:247 and 246 share homology with two Hsp70 family proteins, Hsp-binding protein 1 found in
mice (GENSEQP accession number: Z50154) and humans (GENPEPT accession number:
AF093420), and Hsp-binding protein 2 found in humans (GENPEPT accession number:
AF187859).

The proteins of the invention are related to a yeast lumen protein of the endoplasmic
reticulum, SLS1p. This protein acts in the preprotein translocation process, interacting directly with
translocating polypeptides to facilitate their transfer and/or help their folding in the endoplasmic
reticulum (Boisrame et al. J Biol Chem 1996; 271:11668-75). In addition, Sls1p is believed to act
as a cofactor of the chaperone protein Kar2 (Boisrame et al. J Biol Chem 1998; 273:30903-8;
Kabani et al. Gene 2000; 241:309-15). Thus, the proteins of the invention are presumed to have
cellular functions similar to those of chaperones. Such functions include a number of cellular
processes, such as protein folding, disassembly of oligomeric protein structures, regulation of
apoptosis, protein degradation, protein translocation in the endoplasmic reticulum, and antigen
presentation (Bukau et al. Cell 1998; 92:351-66). Chaperones are also involved in a number of
disorders, especially autoimmune diseases such as type 1 diabetes, rheumatoid arthritis, systemic
lupus erythematosus, Sjogren syndrome, and mixed connective tissue disease (Feige et al. EXS
1996; 77:359-73; Feili-Hariri et al. J Autoimmun 2000; 14:133-42). Chaperones are also involved
in various disorders including tuberculosis and leprosy (Zugel et al. Clin Microbiol Rev 1999;
12:19-39), neurodegenerative disorders such as Alzheimer's and Parkinson's diseases (Yoo et al. J Neural
Transm Suppl 1999; 57:315-22), and malignant disorders (Csermely et al. Pharmacol Ther 1998;
79:129-68). In addition, a growing body of evidence suggests the involvement of the Hsp60
chaperone in the development of atherosclerosis (Xu et al. Circulation 2000; 102:14-20). Thus, the
present proteins, which are presumed to be cofactors of a chaperone as summarized above, are
believed to have analogous cellular functions and to be involved in similar pathological processes.

In one embodiment, the present invention provides methods of using the present proteins to
identify specific cell types in vitro and in vivo. For example, as chaperone proteins are often
upregulated in response to cellular stress, the detection of cells expressing elevated levels of the 
proteins provides a tool for detecting cells under stress. As cellular stress has been implicated in a 
number of disorders, such as cardiovascular disorders, neurodegenerative disorders, and cancer, the 
ability to detect such stress thus provides a diagnostic or screening tool for such conditions. In
addition, the present polypeptides and polynucleotides can be used to identify liver, pancreas, and
prostate tissues, and cells derived from these tissues. The ability to specifically visualize such 
tissues and cells is useful for a number of applications, including to determine the origin or identity 
of, e.g. cancerous cells, as well as to facilitate the identification of particular cells and tissues for, 
e.g. the evaluation of histological slides. 

In addition, the present polypeptides and polynucleotides can be used to develop diagnostic
and screening assays for diseases characterized by an abnormal level or activity of the proteins of
SEQ ID NOs:247 and 246. Such disorders include, but are not limited to, infectious diseases,
neurodegenerative disorders such as Alzheimer's and Parkinson's diseases, schizophrenia, alopecia,
aging, atherosclerosis, malignant disorders of various types, and autoimmune diseases including
type 1 diabetes, rheumatoid arthritis, systemic lupus erythematosus, Sjogren syndrome, and mixed
connective tissue disease. Such assays can be performed using any biological sample, such as
serum or plasma.

In still another embodiment, the proteins of the invention or part thereof can be used to 
prevent cells from undergoing apoptosis. Specifically, as chaperone proteins have been shown to 
protect cells from apoptosis, any method of increasing the level or activity of the present protein can 
be used to prevent cells from undergoing apoptosis, in vitro or in vivo. For example, a
polynucleotide encoding a protein of SEQ ID NO:247 or 246, or any fragment or derivative thereof,
can be introduced into cells, e.g. in a vector, wherein the protein is expressed in the cells. 
Alternatively, a protein of SEQ ID NO:247 or 246 itself can be administered to cells, preferably in a 
formulation that leads to the internalization of the protein by the cells. Also, any compound that 
increases the expression or activation of the proteins within the cells can be administered.
Preventing cells from undergoing apoptosis can be used for any of a large number of purposes,
including, but not limited to, to prevent the death of cells being grown in culture, to prevent in a 
patient the apoptosis associated with any of a number of disorders, or to prevent apoptosis in cells 
of a patient undergoing a treatment that increases the level of cellular stress, such as chemotherapy. 

In another embodiment, inhibiting the proteins of the invention can be used to induce
apoptosis in undesired cells. Such inhibition can be accomplished in any of a number of ways,
including, but not limited to, using antibodies, antisense sequences, dominant negative forms of the
protein, or small molecule inhibitors of the expression or activity of the proteins. Such induction of 
apoptosis can be used to eliminate any undesired cells, for example cancer cells, in a patient. 
Preferably, such inhibitors are targeted specifically to the undesired cells in the patient. 

In another embodiment, various disorders can be treated, attenuated and/or prevented by a
protein of SEQ ID NOs:247 or 246, or part thereof, or any other compound that can affect the level
or activity of the proteins such as nucleic acids, antibodies, or chemical substances. In a preferred 
embodiment, proteins or other compounds directed to the proteins of the invention can be used to 
treat or prevent disorders in which the activity or level of the proteins of SEQ ID NO:247 or 246 is
unbalanced. Such diseases include, but are not limited to, infectious diseases, neurodegenerative
disorders such as Alzheimer's and Parkinson's diseases, schizophrenia, alopecia, aging,
atherosclerosis, malignant disorders of various types, and autoimmune diseases including type 1
diabetes, rheumatoid arthritis, systemic lupus erythematosus, Sjogren syndrome, and mixed
connective tissue disease. In another
embodiment, the proteins of SEQ ID NO:247 or 246 or part thereof can be used as vaccines for
various disorders including, but not limited to, cancer (Wang et al. Immunol Invest 2000;29:131-7),
tuberculosis (Silva et al. Microbes Infect 1999;1:429-35), diabetes (Int Immunol 1999;11:957-66),
and atherosclerosis (Xu et al. Arterioscler Thromb 1992;12:789-99).



Protein of SEQ ID NO:389 (internal designation 109-003-1-0-G4-CS)

The protein of SEQ ID NO:389 is encoded by the cDNA of SEQ ID NO:148. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:389
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 109-003-1-0-G4-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:148 described throughout the present application also
pertain to the human cDNA of clone 109-003-1-0-G4-CS. The protein of SEQ ID NO:389 is highly
homologous to two human proteins encoded by genes listed in Genbank under accession numbers
AF143723 and AF112210, the disclosures of which are incorporated herein by reference in their
entireties.

The polypeptides encoded by Genbank accession numbers AF143723 and AF112210 belong
to the Hsp70 protein family (even though one of them has erroneously been attributed to the related
Hsp60 family). Many genes encoding "Hsps" (heat shock proteins) have been cloned and
sequenced, including, for example, human hsp70 (GenBank Accession Nos. M11717 and M15432;
see also Hunt and Morimoto, 1985, Proc. Natl. Acad. Sci. USA 82: 6455-6459, the disclosures of
which are incorporated herein by reference in their entireties), human hsp90 (GenBank Accession
No. X15183; see also Yamazaki et al., 1989, Nucleic Acids Res. 17: 7108, the disclosures of which
are incorporated herein by reference in their entireties), and human gp96 (GenBank Accession No.
M33716; see also Maki et al., 1990, Proc. Natl. Acad. Sci. USA 87: 5658-5662, the disclosures of
which are incorporated herein by reference in their entireties).

The protein of SEQ ID NO:389 and the two homologs mentioned above are actually closer
to yeast members of the family than to the human Hsp70, which makes the corresponding genes
previously unidentified human members of the family. Both the Pfam and Prosite Hsp70 signatures
(respectively the "HSP70" Pfam model from amino acid position 3 to 509 and the PS01036 Prosite
motif from position 332 to 346) are recognized within the protein of SEQ ID NO:389. The protein
of SEQ ID NO:389 differs from the protein encoded by AF112210 at amino acid positions 282, 312
and 326, and from the protein encoded by AF143723 at amino acid positions 15 and 326.
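
As an illustration only, position differences of the kind listed above can be tabulated from two pre-aligned sequences of equal length. The sketch below is a minimal example under that assumption; the function name and the toy sequences are hypothetical and are not the sequences of SEQ ID NO:389, AF112210 or AF143723.

```python
def differing_positions(seq_a: str, seq_b: str) -> list[int]:
    """Return the 1-based positions at which two pre-aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # enumerate from 1 so that positions follow amino acid numbering
    return [pos for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1) if a != b]


# Toy sequences for demonstration only (not real data)
reference = "MKTAYIAKQR"
variant = "MKTGYIAKQK"
print(differing_positions(reference, variant))  # [4, 10]
```

Sequences of unequal length would first have to be aligned with gaps, which this sketch does not attempt.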

Heat shock proteins are a family of molecular chaperone proteins which have long been
known to play essential roles in a multitude of intra- and intercellular processes, including protein
synthesis and folding, vesicular trafficking, and antigen processing and presentation. Hsps are
among the most highly conserved proteins known, and carry out many of their regulatory activities
via protein-protein interactions. Historically they were identified by induction under conditions of
stress, during which they are now known to provide an essential action of preventing aggregation
and assisting refolding of misfolded proteins. The major stress proteins accumulate to very high
levels in stressed cells but occur at low to moderate levels in cells that have not been stressed.

Hsp70 is one member of the heat shock protein family (Milner, C. M. and Campbell, R. D.
Immunogenetics 32: 242-251 (1990); Genbank Accession No. M59828, the disclosures of which
are incorporated herein by reference in their entireties). The 70kD heat shock protein is a highly
conserved, ubiquitous protein involved in chaperoning proteins to various cellular organelles.
Unlike other members of the Hsp family, it is highly inducible in mammals. Although Hsp70
is barely detectable at normal temperatures, it becomes one of the most actively synthesized
proteins in the cell upon heat shock (Welch et al., 1985, J. Cell. Biol. 101:1198-1211, the disclosure
of which is incorporated herein by reference in its entirety). In contrast, the Hsp90 and Hsp60
proteins are abundant at normal temperatures in most, but not all, mammalian cells and are further
induced by heat (Lai et al., 1984, Mol. Cell. Biol. 4:2802-10; van Bergen en Henegouwen et al.,
1987, Genes Dev., 1:525-31, the disclosures of which are incorporated herein by reference in their
entireties). Furthermore, the Hsp70 proteins act as monomers whereas the functionally related Hsp60
proteins are associated in vivo within large double ring assemblies of nearly a million daltons. The
various actions of the Hsps all rely on their ability to complex polypeptide segments,
preferably hydrophobic ones, and to stabilize them in an extended conformation in an ATP-dependent
manner. The complexed polypeptides can be antigenic peptides (in which case the Hsps help
direct them to the major histocompatibility complexes for presentation) or misfolded proteins,
which are helped to adopt the proper conformation by repeated cycles of binding to Hsps
followed by release/refolding (see Bukau, B. and Horwich L., 1998, Cell 92: 351-366, the
disclosure of which is incorporated herein by reference).

On the basis of the above information, it is believed that the protein of SEQ ID NO:389 is a 
member of the human Hsp70 family. Accordingly, the protein of SEQ ID NO:389 may play a role
in protein synthesis/folding, cellular trafficking, antigen processing, the cellular stress response and
the immune response in immuno-competent cell types. Additional information regarding the 
protein of SEQ ID NO:389 may be obtained by performing a binding assay with a consensus Hsp70 
substrate using the methods described in Rudiger et al., 1997, EMBO J. 16, 1501-1507, the 
disclosure of which is incorporated herein by reference in its entirety. 

One embodiment of the present invention relates to methods of using the protein of SEQ ID
NO:389 or fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or
200 consecutive amino acids thereof, or fragments having a desired biological activity, as a
stabilizing adjuvant to slow down protein degradation, boost the yields of recombinant proteins or
regenerate denatured proteins. In such an embodiment, the protein of SEQ ID NO:389 or fragment
thereof is mixed with a composition comprising the protein for which it is desired to slow down
degradation, boost yield, or regenerate denatured proteins under conditions which facilitate the
desired result.

For example, numerous commercial assay kits commonly used by those skilled in the arts
of molecular biology and biochemistry depend on the biological properties of proteins (mostly
enzymes) which can be very short-lived in vitro due to the low stability of those proteins. An
example is described in Patent DE4124286, the disclosure of which is incorporated herein by
reference in its entirety, wherein the low intrinsic stability of test solutions used in optical tests is
increased by addition of chaperone proteins, thus making the test more sensitive.

The protein of SEQ ID NO:389 may also be used to increase the yield or activity of
recombinant proteins. In recombinant DNA technology, a major unsolved problem is the solubility
and biological activity of the recombinantly overexpressed protein in a host, especially a bacterial or
yeast host. Many eukaryotic proteins, especially the secreted ones, require for correct folding a
specific cellular machinery which is lacking in bacterial hosts such as E. coli or becomes
insufficient in mammalian/yeast cells due to high expression of the protein. The ability of the
protein of SEQ ID NO:389 or fragments thereof to ensure proper folding of recombinant proteins
may be utilized as follows. The protein of SEQ ID NO:389 may be coexpressed with the
recombinant protein in bacterial or eukaryotic hosts to cause the hosts to express the heterologous
proteins or polypeptides in a form having increased solubility and/or biological activity. For
example, the protein of SEQ ID NO:389 or fragments thereof may be used in the methods described
in PCT application WO 93/25681, the disclosure of which is incorporated herein by reference in its
entirety. Alternatively, the protein of SEQ ID NO:389 or fragments thereof may be exogenously
added to the cell cultures as described in PCT application WO 00/08135, the disclosure of which is
incorporated herein by reference in its entirety. Indeed, PCT application WO 00/31113, the
disclosure of which is incorporated herein by reference in its entirety, shows that when added
exogenously to cells, Hsp70 is readily imported into both cytoplasmic and nuclear compartments.
Preparation and purification of the protein of SEQ ID NO:389 or fragments thereof may be carried
out as described in Patent US-6,007,821, the disclosure of which is incorporated herein by reference
in its entirety.

The protein of SEQ ID NO:389 or fragments thereof may further be used to regenerate
denatured proteins. Recombinantly expressed proteins with poor biological activity are routinely
denatured with a potent denaturing agent, such as guanidine hydrochloride, followed by refolding
by dilution with a large amount of a diluent to reduce the concentration of the denaturing agent.
However, this method often results in a poor refolding rate which may be significantly increased by
addition of a cocktail of chaperone proteins in a fashion similar to that described for Hsp60 in Eur.
Patent EP0650975, the disclosure of which is incorporated herein by reference in its entirety. The
advantage of using a cocktail of chaperone proteins is to accommodate differences in binding
specificity of the different Hsp families and the different members within each family. For instance,
vertebrate actin is efficiently folded by the chaperonin of the eukaryotic cytosol (Gao et al., 1992,
Cell 69:1043-1050, the disclosure of which is incorporated herein by reference in its entirety) but
not at all by Hsp60 (Tian et al., 1995, Nature 375:250-253, the disclosure of which is incorporated
herein by reference in its entirety).

Another embodiment of the present invention relates to the use of the protein of SEQ ID
NO:389 or fragments thereof to deliver heterologous compounds (proteins, peptides, or DNA) to
specific cellular compartments, preferably the cytoplasm and the nucleus. If desired, the protein of
SEQ ID NO:389 or a fragment thereof may be fused to the heterologous compound. For example,
the protein of SEQ ID NO:389 or fragments thereof may be used to chaperone compounds into cells
using the methods described in PCT application WO 00/31113, the disclosure of which is
incorporated herein by reference in its entirety. In the methods described in WO 00/31113, Hsp70
was used to deliver NF-κB, a key transcriptional regulator of inflammatory responses, into the
nuclear compartment. It was shown that a fusion protein composed of a C-terminal Hsp70 peptide
and amino acids 37-409 of the p50 subunit of NF-κB was directed into the nucleus of cells, could
bind DNA specifically, and activated kappa Ig expression and TNFα production.

In one embodiment of the present invention, the protein of SEQ ID NO:389 or a fragment
thereof may be used in human therapy as a modulator of immune response. Disease states which
may be treated by Hsp70, fragments thereof, and/or Hsp70 complexes of the present invention
include transplant rejection (see US5,891,653, the disclosure of which is incorporated herein by
reference in its entirety) and autoimmune diseases, such as insulin dependent diabetes mellitus,
rheumatoid arthritis, multiple sclerosis, juvenile diabetes, asthma, and inflammatory bowel disease,
as well as inflammatory diseases, cancer, viral replication diseases and vascular diseases as
described in the following patents, each of which is incorporated herein by reference in its entirety:
US6,007,821; WO 00/31113; WO 99/18801 (treatment of auto-immune diseases); US6,017,540;
US6,017,544; AU3425899; WO 99/54464; US5,837,251; US5,830,464; WO 98/34642; WO
98/34641; US5,750,119; WO 97/10001; WO 96/10411 (cancer treatment); DE19813760,
DE19813759 (both autoimmune disease and cancer).

The protein of SEQ ID NO:389 or fragments thereof may also be used to treat or ameliorate 
autoimmune disease. In this embodiment, compositions of complexes of heat shock/stress proteins 
(including, but not limited to the protein of SEQ ID NO:389) are administered to an individual 
suffering from an autoimmune disease. The complexes may be comprised of the protein of SEQ ID
NO:389 or fragments thereof alone or may include other heat shock/stress proteins. In one
embodiment, the protein of SEQ ID NO:389 or a fragment thereof is bound noncovalently to 
antigenic molecules and administered to individuals suffering from autoimmune disease to suppress 
the autoimmune response. Alternatively, compositions comprising the protein of SEQ ID NO:389 
or fragments thereof in an un-complexed form (i.e., free of antigenic molecules) may also be
administered to an individual suffering from autoimmune disease to suppress the immune response
(see Patent US6,007,821, the disclosure of which is incorporated herein by reference in its entirety).

The ability of stress proteins to chaperone the antigenic peptides of the cells from which
they are derived allows them to be used to isolate the antigenic peptides expressed in a tumor. In
this embodiment of the present invention, complexes comprising the protein of SEQ ID NO:389 or
fragments thereof and an antigenic peptide expressed by the tumor are isolated. The isolated
complexes are administered back to the individual from which they were obtained in order to elicit
an immune response against the tumor. Accordingly, this approach circumvents the necessity of
isolating and characterizing specific tumor antigens and enables the skilled artisan to readily prepare
immunogenic compositions effective against a tumor in an individual (see Patent US6,017,544, the
disclosure of which is incorporated herein by reference in its entirety).

The protein of SEQ ID NO:389 may also be used to diagnose bladder cancer. The segment
of the protein of SEQ ID NO:389 extending between amino acid positions 1 through 187 is more
than 99% identical to a polypeptide which is linked to bladder cancer (see Patent
DE19818620, the disclosure of which is incorporated herein by reference in its entirety). The 187
amino acid-long polypeptide described in DE19818620 was identified as the partial product of the
only gene for which expression was significantly altered in a bladder tumour compared to a healthy
bladder. In another embodiment of the present invention, the protein of SEQ ID NO:389 or a
fragment thereof may be used to diagnose disorders associated with altered intercellular
communication or secretion. In such techniques, the level of the protein of SEQ ID NO:389 in an
individual is measured using techniques such as those described herein. The level of the protein of
SEQ ID NO:389 in the individual is compared to the level in normal individuals. An altered level
of the protein of SEQ ID NO:389 relative to normal individuals suggests that the individual is
suffering from bladder cancer. The level of the protein of SEQ ID NO:389 present in the individual
may be determined by contacting a sample from the individual with an antibody directed against the
polypeptide of SEQ ID NO:389. Alternatively, the level of the protein of SEQ ID NO:389 in the
individual may be measured by determining the level of RNA encoding the protein of SEQ ID
NO:389 in the sample. RNA levels may be measured using nucleic acid arrays or using techniques
such as in situ hybridization, Northern blots, dot blots or other techniques familiar to those skilled in
the art. If desired, an amplification reaction, such as a PCR reaction, may be performed on the
nucleic acid sample prior to analysis. The level of RNA in the sample is compared to RNA levels
in normal individuals to determine whether the individual is suffering from bladder cancer.
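
The comparison of a measured level with the levels in normal individuals can be sketched numerically. The example below is only an illustrative assumption of how such a comparison might be scored: it expresses the sample as a z-score against a reference cohort and flags it when the score exceeds a cutoff. The function name, the cutoff of 2.0 and the toy values are hypothetical and are not taken from the present application.

```python
from statistics import mean, stdev


def level_z_score(sample_level: float, normal_levels: list[float]) -> float:
    """Express a sample level as a z-score relative to a reference cohort."""
    spread = stdev(normal_levels)  # sample standard deviation of the cohort
    if spread == 0:
        raise ValueError("reference levels show no variation")
    return (sample_level - mean(normal_levels)) / spread


def is_altered(sample_level: float, normal_levels: list[float],
               z_cutoff: float = 2.0) -> bool:
    """Flag a level as altered when it lies beyond the (hypothetical) cutoff."""
    return abs(level_z_score(sample_level, normal_levels)) > z_cutoff


# Toy protein or RNA levels in arbitrary units (not real measurements)
normal = [10.0, 12.0, 11.0, 9.0, 13.0]
print(is_altered(16.0, normal))  # True
print(is_altered(11.5, normal))  # False
```

Using the absolute value treats both increases and decreases as an altered level; an assay interested in only one direction would test the sign of the score instead.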

Antibodies against the protein of SEQ ID NO:389 or nucleic acid probes
complementary to the sequence encoding the protein of SEQ ID NO:389 may also be used as a
prognostic indicator of breast tumor recurrence as described in U.S. Patent No. 5,188,964, the
disclosure of which is incorporated herein by reference in its entirety. As described in U.S. Patent
No. 5,188,964, specific levels of the stress response proteins (including Hsp70) were identified,
above which the probability of tumor recurrence is highly significant. Accordingly, the levels of the
protein of SEQ ID NO:389 or RNA encoding the protein of SEQ ID NO:389 may be determined
in a sample from an individual who has experienced a breast tumor in the past. Protein or
RNA levels may be measured as described herein. If the protein or RNA levels exceed the levels
above which tumor recurrence is likely, an appropriate course of treatment may be initiated.

In another embodiment of the present invention, the protein of SEQ ID NO:389 may be
used to promote tissue repair and/or increase cell survival in stress conditions such as hypoxia,
oxidative stress, genotoxic agents and more generally harmful conditions leading to programmed
cell death. The beneficial effect is produced either by protecting the cell proteins from premature
denaturation/degradation or by directly inhibiting a signal transduction pathway leading to
programmed cell death (Gabai VL. et al., 1998, FEBS Lett. 438:1-4, the disclosure of which is
incorporated herein by reference in its entirety). Those conditions include but are not limited to
infarction, heart surgery, stroke, neurodegenerative diseases, epilepsy, trauma, atherosclerosis,
restenosis after angioplasty, and nerve damage. For example, it is known that hypoxic stress is a
signal that increases the amount of Hsp70 in cardiac tissue, whereupon Hsp70 helps cells survive by
binding to partially denatured proteins and assisting in the refolding of these proteins into more
stable native structures. Such assistance would be extremely important in providing protection to
the heart during periods of hypoxia such as during an infarct or during surgery when blood flow to
the heart may be temporarily halted. Several groups have also shown that overproduction of Hsp70
leads to protection in several different models of nervous system injury (reviewed in Midori AY et
al., 1999, Mol. Med. Today, 5:525-31, the disclosure of which is incorporated herein by reference in
its entirety). Therapeutic methods for administering the protein of SEQ ID NO:389 or a fragment
thereof include but are not limited to those disclosed in Patent WO 00/23093, the disclosure of
which is incorporated herein by reference in its entirety.

Accordingly, it may be desirable to increase or decrease the level of the protein of SEQ ID
NO:389 in an individual having a condition resulting from an increased or decreased level of the
protein. In such embodiments, the protein of SEQ ID NO:389, or a fragment thereof, is
administered to an individual in whom it is desired to increase or decrease any of the foregoing
activities. The protein of SEQ ID NO:389 or fragment thereof may be administered directly to the
individual or, alternatively, a nucleic acid encoding the protein of SEQ ID NO:389 or a fragment
thereof may be administered to the individual. Alternatively, an agent which increases the activity
of the protein of SEQ ID NO:389 may be administered to the individual. Such agents may be
identified by contacting the protein of SEQ ID NO:389 or a cell or preparation containing the
protein of SEQ ID NO:389 with a test agent and assaying whether the test agent increases the
activity of the protein. For example, the test agent may be a chemical compound or a polypeptide
or peptide.

Alternatively, the activity of the protein of SEQ ID NO:389 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:389 may be identified by contacting the protein of
SEQ ID NO:389 or a cell or preparation containing the protein of SEQ ID NO:389 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.
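
The screening step described above, in which an activity readout with a test agent is compared to the readout without it, can be sketched as a simple fold-change call. This is a minimal illustration under stated assumptions: the function name, the 1.5-fold threshold and the toy readouts are hypothetical, and the text does not prescribe any particular activity assay or cutoff.

```python
def classify_agent(treated: float, untreated: float, fold: float = 1.5) -> str:
    """Call the effect of a test agent from treated vs. untreated activity readouts."""
    if untreated <= 0 or fold <= 1:
        raise ValueError("untreated readout must be positive and fold must exceed 1")
    ratio = treated / untreated
    if ratio >= fold:
        return "increases activity"
    if ratio <= 1 / fold:
        return "decreases activity"
    return "no clear effect"


# Toy activity readouts in arbitrary units (not real measurements)
print(classify_agent(180.0, 100.0))  # increases activity
print(classify_agent(40.0, 100.0))   # decreases activity
print(classify_agent(110.0, 100.0))  # no clear effect
```

In practice replicate measurements and a statistical test would replace the single-ratio comparison shown here.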

Protein of SEQ ID NO:250 (internal designation 105-053-4-0-E8-CS)

The protein of SEQ ID NO:250 is encoded by the cDNA of SEQ ID NO:9. It will be
appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:250 described
throughout the present application also pertain to the polypeptide encoded by the human cDNA of
clone 105-053-4-0-E8-CS. In addition, it will be appreciated that all characteristics and uses of the
nucleic acid of SEQ ID NO:9 described throughout the present application also pertain to the human
cDNA of clone 105-053-4-0-E8-CS. The protein of SEQ ID NO:250 is found in prostate and
exhibits extensive homologies to stretches of pancreatic zymogen granule membrane protein GP2
(Glycoprotein-2). In particular, the protein of SEQ ID NO:250 exhibits homologies to the GP2
proteins of human (SWISS-PROT accession number P55259, the disclosure of which is
incorporated herein by reference in its entirety), rat (SWISS-PROT accession number P19218, the
disclosure of which is incorporated herein by reference in its entirety) and dog (SWISS-PROT
accession number P25291, the disclosure of which is incorporated herein by reference in its
entirety). In fact, the amino acid sequence of SEQ ID NO:250 is identical to that of the
human GP2 sequence except that the protein of SEQ ID NO:250 is missing amino acids 62 to 484
of the human GP2 sequence. The protein of SEQ ID NO:250 contains two hydrophobic regions,
namely the N-terminal signal peptide (amino acid residues 8-28) and the C-terminal transmembrane
domain (amino acid residues 91-111).
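
The relationship stated above, a sequence identical to a parent sequence except for a missing internal stretch, can be expressed with 1-based inclusive coordinates. Removing residues 62 to 484 inclusive deletes 484 - 62 + 1 = 423 residues. The sketch below illustrates the indexing on a toy string; the function name and the toy sequence are hypothetical and do not reproduce the GP2 sequence.

```python
def delete_residues(parent: str, first: int, last: int) -> str:
    """Remove residues first..last (1-based, inclusive) from a parent sequence."""
    if not 1 <= first <= last <= len(parent):
        raise ValueError("deletion range must lie within the parent sequence")
    # parent[:first - 1] keeps residues 1..first-1; parent[last:] keeps last+1..end
    return parent[:first - 1] + parent[last:]


# Toy sequence for demonstration only (not real data)
toy_parent = "ABCDEFGHIJ"
print(delete_residues(toy_parent, 3, 7))  # ABHIJ
```

The same call with `first=62` and `last=484` on a full-length parent would return a sequence 423 residues shorter than its input.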

GP2 (Glycoprotein-2) is the major membrane glycoprotein of secretory zymogen granule
(ZG) membranes within pancreatic acinar cells (Fukuoka et al. 1990 Nuc. Acids Res., 18:5900;
Fukuoka et al. 1991 Proc. Natl. Acad. Sci., USA, 88:2898-2902; Fukuoka et al. 1992 Proc. Natl.
Acad. Sci. USA, 89:1189-1193; Freedman, et al. 1993 Eur. J. Cell Biol. 61:229-238; Scheele et al.
1994 Pancreas 9:139-149; Freedman et al. 1994 Annals N.Y. Acad. Sci. 713:199-206, the disclosures
of which are incorporated herein by reference in their entireties). GP2 homologues are also widely
distributed among diverse epithelial tissues known to possess regulated secretory processes,
including parotid, submandibular gland, stomach, liver and lung (Fukuoka et al. 1992 Proc. Natl.
Acad. Sci. USA, 89:1189-1193).

In addition to ZG membranes, GP2 is also located in pancreatic acinar cells in rough
endoplasmic reticulum, Golgi, trans-Golgi components, condensing vacuoles, apical plasma
membranes (APM), basolateral plasma membranes (BPM), and within ZGs and acinar lumina
(Scheele et al., 1994 Pancreas 9:139-149). GP2 is linked to the membrane of the ZG via a
glycosylphosphatidyl inositol-anchor (GPI-anchor) (Fukuoka et al. 1991 Proc. Natl. Acad. Sci.
USA, 88:2898-2902; Lebel and Beattie 1988 Biochem. Biophys. Res. Comm. 254:1189-93, the
disclosures of which are incorporated herein by reference in their entireties) and forms complexes,
usually tetrameric complexes, below a pH of about 6.5.

During assembly of secretory granules within the trans-Golgi network (TGN), the low pH
of the TGN causes formation of GP2 complexes. These complexes bind to proteoglycans (PG),
forming a fibrillar GP2/PG meshwork on the lumenal surface of the ZG. The GP2/PG matrix may
function in membrane sorting within the TGN, assembly of ZG membranes, inactivation of ZG
membranes during granule storage, and regulation of ZG membrane trafficking at the apical plasma
membrane. The GP2/PG matrix may also protect the lumenal aspect of the granule membrane from
contact with secretory enzymes contained within the granules and facilitate the specific release of
secretory enzymes during exocytosis at the apical plasma membrane.

The enzymes and the acidic milieu contained in the ZG are released into the lumen of the
pancreas through exocytosis by acinar cells. The pH at the apical plasma membrane of the acinar
cells, and of the pancreatic lumen in general, is maintained at an essentially neutral or alkaline pH
by the fluid and bicarbonate secreted by pancreatic ductal cells. The increased pH at the apical
plasma membrane (relative to the acidic pH within the ZG) optimizes the conditions for enzymatic
cleavage of the GPI anchor of GP2, resulting in release of GP2 and GP2/PG complexes from the
apical membrane (Scheele et al. (1994) Pancreas 9:139-149, the disclosure of which is incorporated
herein by reference in its entirety). The form of GP2 produced by GPI-anchor cleavage is termed
globular GP2 (gGP2).

It is believed that the protein of SEQ ID NO:250 is a GP2 protein, and is thus likely 
involved in regulated membrane trafficking along apical secretory processes in a variety of 
epithelial cells. 

Accordingly, the present invention includes the use of the protein of SEQ ID NO:250,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity in the modulation
of membrane sorting within the trans-Golgi network, assembly of zymogen granule membranes,
inactivation of zymogen granule membranes during granule storage, regulation of zymogen
granule membrane trafficking at the apical plasma membrane, and release of secretory enzymes during
exocytosis at the apical plasma membrane. In such embodiments, the protein of SEQ ID NO:250,
or a fragment thereof, is administered to an individual in whom it is desired to increase or decrease
any of the foregoing activities. The protein of SEQ ID NO:250 or fragment thereof may be
administered directly to the individual or, alternatively, a nucleic acid encoding the protein of SEQ
ID NO:250 or a fragment thereof may be administered to the individual. Alternatively, an agent
which increases the activity of the protein of SEQ ID NO:250 may be administered to the
individual. Such agents may be identified by contacting the protein of SEQ ID NO:250 or a cell or
preparation containing the protein of SEQ ID NO:250 with a test agent and assaying whether the
test agent increases the activity of the protein. For example, the test agent may be a chemical
compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:250 may be decreased by 
administering an agent which interferes with such activity to an individual. Agents which interfere 
with the activity of the protein of SEQ ID NO:250 may be identified by contacting the protein of
SEQ ID NO:250 or a cell or preparation containing the protein of SEQ ID NO:250 with a test agent 
and assaying whether the test agent decreases the activity of the protein. For example, the agent 
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an 
antisense nucleic acid or a triple helix-forming nucleic acid. 

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify tissues, preferably pancreas
and prostate, or to distinguish between two or more possible sources of a tissue sample on the basis
of the level of the protein of SEQ ID NO:250 in the sample. For example, the protein of SEQ ID
NO:250 or fragments thereof may be used to generate antibodies using any techniques known to
those skilled in the art, including those described herein. Such tissue-specific antibodies may then
be used to identify tissues of unknown origin, for example, forensic samples or differentiated tumor
tissue that has metastasized to foreign bodily sites, or to differentiate different tissue types in a
tissue cross-section using immunochemistry. In such methods a tissue sample is contacted with the
antibody, which may be detectably labeled, under conditions which facilitate antibody binding. The
level of antibody binding to the test sample is measured and compared to the level of binding to
control cells from pancreas or prostate or tissues other than pancreas or prostate to determine
whether the test sample is from pancreas or prostate. Alternatively, the level of the protein of SEQ
ID NO:250 in a test sample may be measured by determining the level of RNA encoding the protein
of SEQ ID NO:250 in the test sample. RNA levels may be measured using nucleic acid arrays or
using techniques such as in situ hybridization, Northern blots, dot blots or other techniques familiar
to those skilled in the art. If desired, an amplification reaction, such as a PCR reaction, may be
performed on the nucleic acid sample prior to analysis. The level of RNA in the test sample is
compared to RNA levels in control cells from pancreas or prostate or tissues other than pancreas or
prostate to determine whether the test sample is from pancreas or prostate.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of membranes or zymogen granules using any
techniques known to those skilled in the art. For example, an antibody against the protein of SEQ
ID NO:250 or a fragment thereof may be fixed to a solid support, such as a chromatography matrix.
A preparation containing membranes or zymogen granules is placed in contact with the antibody
under conditions which facilitate binding to the antibody. The support is washed and then the
membranes or zymogen granules are released from the support by contacting the support with
agents which cause the membranes or zymogen granules to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:250 or a
fragment thereof may be used to diagnose disorders associated with altered intercellular
communication or secretion. In such techniques, the level of the protein of SEQ ID NO:250 in a
patient is measured using techniques such as those described herein. The level of the protein of
SEQ ID NO:250 in the patient is compared to the level in control individuals. An elevated level or
decreased level of the protein of SEQ ID NO:250 relative to control individuals suggests that the
patient is suffering from a defect in intercellular communication or secretion.

In another embodiment, the protein of SEQ ID NO:250 or a fragment thereof is used to
facilitate or decrease exocytosis. For example, the protein of SEQ ID NO:250 or fragment thereof
may be used to increase or decrease the release of secretory enzymes within pancreatic acinar cells
or prostatic cells. Accordingly, the protein of the invention or part thereof may be used to diagnose,
treat and/or prevent disorders associated with abnormal membrane trafficking including but not
limited to viral or other infections, traumatic tissue damage, and hereditary diseases such as
pancreatitis or prostatitis, invasive carcinomas and lymphomas. In such methods, the protein of
SEQ ID NO:250, a fragment of the protein of SEQ ID NO:250, or an agent which increases or
decreases the activity of the protein of SEQ ID NO:250 is administered to an individual using
techniques such as those described herein.

In another embodiment, the invention relates to methods of using the protein of SEQ ID
NO:250 or a fragment thereof in the diagnosis of pancreatitis or prostatitis by detecting an elevation
in the level of the protein of SEQ ID NO:250 in a sample of bodily fluid, such as human blood,
serum, or urine. The protein may be detected using any method known to those skilled in the art,
including those described herein. In some embodiments, the protein of SEQ ID NO:250 or
fragment thereof may be detected using the methods described in U.S. Patent Nos. 5,436,169 or
5,663,315, the disclosures of which are incorporated herein by reference in their entireties.

References:

U.S. Patent Nos. 5,436,169; 5,663,315

Nucleic Acids Research 18(9):5900 (1990)

Proc. Natl. Acad. Sci. USA 88(7):2898-2902 (1991)

Proc. Natl. Acad. Sci. USA 89:1189-1193 (1992)

Eur. J. Cell Biol. 61:229-238 (1993)

Freedman et al., Annals N.Y. Acad. Sci. 713:199-206 (1994)

Scheele et al., Pancreas 9(2):139-149 (1994)

Protein of SEQ ID NO:274 (internal designation: 145-56-3-0-D5-CS)

The protein of SEQ ID NO:274, encoded by the cDNA of SEQ ID NO:33, is homologous to
the human RNA 3'-terminal phosphate cyclase-like protein 1 (Rcl1) (trEMBL accession number
CAB89811), which is abundant in the nucleolus.

The RNA 3'-terminal phosphate cyclase, an enzyme originally identified in extracts from
human HeLa cells and Xenopus oocyte nuclei, catalyzes the ATP-dependent conversion of the 3'-
terminal phosphate group into a 2',3'-cyclic phosphodiester at the 3'-end of RNA, resulting in the
activation of the 3' end of RNA molecules. Database searches showed that genes encoding proteins
similar to the human and E. coli RNA 3'-terminal phosphate cyclases are conserved among
eukarya, bacteria and archaea, arguing for an essential function of the enzyme in RNA metabolism
(Genschik P. et al. - EMBO J - 1998, 16, p.2955-2967). Similarly, analysis of the human RNA 3'-
terminal phosphate cyclases and related proteins from other organisms indicated that they can be
divided into 2 subfamilies referred to as RNA 3'-terminal phosphate cyclases (Rtc) and RNA 3'-
terminal phosphate cyclase-like proteins (Rcl). These 2 subfamilies share several sequence elements,
including a nearly universally conserved amino acid sequence RGxxPxGGGx@ (where x stands for
any, and @ for hydrophobic amino acids), designated originally as the cyclase signature, which
corresponds to the Prosite signature. Although structurally slightly different, these 2 subfamilies of
proteins have the same function and are involved in RNA metabolism. The cyclase signature is
present in the protein of the invention (positions 157 to 167). In addition, this protein also displays
other characteristic signatures of RNA 3'-terminal phosphate cyclase proteins (pfam signature
from positions 1 to 368 and eMotif signatures from positions 12 to 44 and from positions 157 to
168).
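
By way of illustration only, the cyclase signature RGxxPxGGGx@ can be expressed as a regular expression and searched for in an amino acid sequence. The following Python sketch is not part of the disclosed methods. The set of residues treated as hydrophobic and the toy sequence are assumptions made for the example, and the toy sequence is not SEQ ID NO:274.

```python
import re

# Assumption: the source only states that "@" is a hydrophobic residue;
# the exact residue set below is chosen for illustration.
HYDROPHOBIC = "AVILMFWYC"

# RGxxPxGGGx@ : "x" is any residue, "@" is a hydrophobic residue.
CYCLASE_SIGNATURE = re.compile(rf"RG..P.GGG.[{HYDROPHOBIC}]")


def find_cyclase_signature(sequence):
    """Return 1-based (start, end) positions of every signature match."""
    return [(m.start() + 1, m.end())
            for m in CYCLASE_SIGNATURE.finditer(sequence.upper())]


# Hypothetical toy sequence with the signature embedded after three residues.
toy_sequence = "MKT" + "RGAAPLGGGSV" + "DE"
print(find_cyclase_signature(toy_sequence))
```

The signature spans 11 residues, which agrees with the 11-residue interval (positions 157 to 167) reported above for the protein of the invention.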

3'-terminal phosphate cyclases (Rtc and Rcl) catalyze the conversion of 3'-terminal
phosphate to a 2',3'-cyclic phosphodiester in a reaction dependent on ATP, other nucleoside
triphosphates being much less active co-factors. With both enzymes, the cyclization of the 3'-
phosphate at the 3'-end of RNA occurs by a three-step mechanism as follows: (a) adenylation of
the enzyme by ATP; (b) the enzyme acts on RNA-N3'p to produce RNA-N3'pp5'A; (c) a non-
catalytic nucleophilic attack by the adjacent 2'-hydroxyl on the phosphorus in the diester linkage to
produce the cyclic end product.
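
For clarity, the three steps may be written as the following reaction scheme. This is a sketch only: E denotes the enzyme, N>p denotes the 2',3'-cyclic phosphate terminus, and the by-products shown (pyrophosphate in step (a), AMP in step (c)) are assumptions based on the generally described cyclase mechanism and are not recited in the text above.

```latex
\begin{align*}
\text{(a)}\quad & \mathrm{E} + \mathrm{ATP} \rightarrow \mathrm{E\text{-}AMP} + \mathrm{PP_i} \\
\text{(b)}\quad & \mathrm{RNA\text{-}N3'p} + \mathrm{E\text{-}AMP} \rightarrow \mathrm{RNA\text{-}N3'pp5'A} + \mathrm{E} \\
\text{(c)}\quad & \mathrm{RNA\text{-}N3'pp5'A} \rightarrow \mathrm{RNA\text{-}N{>}p} + \mathrm{AMP}
\end{align*}
```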

RNA 3'-terminal phosphate cyclase proteins are involved in RNA processing. It has been
demonstrated that several eukaryotic and prokaryotic RNA ligases require 2',3'-cyclic phosphate
RNA ends, which suggests that the enzyme is involved in the generation or maintenance of cyclic
termini in RNA ligation substrates. These ligases include 2 tRNA-splicing ligases, and the
prokaryotic RNA ligase of unknown function that joins RNA ends via an atypical 2',5'-phosphodiester
linkage (Arn E. et al. - RNA Structure and Function - Cold Spring Harbor Laboratory Press - 1998
p.695-726). The involvement
of these ligases in nuclear pre-tRNA splicing is well documented (Zillmann et al. - Mol Cell Biol -
1991, 11, p5410-5416) (Phizicky E. et al. - J Biol Chem, 1992, 267, p4577-4582) but these enzymes
might also function in the ligation of virusoids and viroids (Branch A. et al. - Science - 1982, 217,
p1147-1149) (Kibertis et al. - EMBO J - 1985, 4, p817-827).

Alternatively, the cyclase could be responsible for producing cyclic phosphate 3'-ends
identified in the spliceosomal U6 small nuclear RNA and some other small RNAs. Furthermore, in
yeast Rcl is associated with U3 small nucleolar RNP (U3 snoRNP), a central component of the 18S
ribosomal RNA (rRNA) processing machinery in yeast and vertebrates (Billy E. et al. - EMBO J -
2000, 19, p2115-2126). However, it seems that Rcl is not a structural component of U3 snoRNP
and that its association with U3 snoRNP occurs, most probably, in large macromolecular complexes
representing nascent ribosomes. In yeast, depletion or inactivation of Rcl causes a defect in 18S
rRNA synthesis, which leads to decreased levels of 40S ribosomal subunits, resulting in an
accumulation of free 60S ribosomal subunits and a fall in the amount of polysomes. In yeast, 18S,
5.8S and 25S rRNAs are derived from a long 35S precursor. This 35S pre-rRNA is normally cleaved
at the A0 site, yielding 33S pre-rRNA. The 33S pre-rRNA is then processed rapidly at sites A1 and
A2 to generate 20S pre-rRNA, which is further processed into mature 18S rRNA. Deletion or
inactivation of Rcl leads to inhibition of processing at sites A0, A1 and A2 (Billy E. et al. - EMBO J -
2000, 19, p2115-2126).

It is believed that the protein of SEQ ID NO:274 or part thereof is involved in RNA
processing, probably as an RNA 3'-terminal phosphate cyclase. Preferred polypeptides of the
invention are polypeptides comprising amino acids 157 to 167, 1 to 368, 12 to 44 and 157 to 168.
Other preferred polypeptides of the invention are fragments of SEQ ID NO:274 having any of the
biological activities described herein. Assays of cyclase activity can be carried out using the Norit
method as described in the article by Filipowicz (Filipowicz W et al. - Methods Enzymol. - 1990,
181, p.499-510), which disclosure is hereby incorporated by reference in its entirety, or any other
techniques known to those skilled in the art.

Thus, an embodiment of the present invention relates to compositions and methods of using
the protein of the invention or part thereof in in vitro RNA manipulation to isolate small nucleolar
RNPs, especially but not limited to U3 snoRNP, from biological samples, using
immunoprecipitation techniques (Billy E. et al. - EMBO J - 2000, 19, p2115-2126), which
disclosure is hereby incorporated by reference in its entirety, or any other techniques known to those
skilled in the art.

In another embodiment, the protein of the invention or part thereof is used to develop
antagonists of the protein of the invention or part thereof in order to inhibit or decrease cellular
proliferation. This can be explained by the fact that the protein of the invention or part thereof is
probably involved in rRNA maturation; thus the use of products that inhibit rRNA maturation
prevents the formation of functional ribosomes, which leads to an inhibition of protein synthesis.
Cells that are unable to synthesize proteins stop growing and ultimately die because they
are unable to regenerate proteins. One preferred embodiment of the invention pertains to the use of
the protein of the invention or part thereof to develop these antagonists, which are added to samples
or materials as a "cocktail" in association with other antimicrobial substances to stop and/or prevent
proliferation of undesired contaminants. For example, the protein of the invention or part thereof
may be used to inhibit the proliferation of undesired bacteria and/or viruses in in vitro cultures. In
another preferred embodiment of the invention, the protein or part thereof could be used to develop
antagonists that could be administered to patients suffering from viral and/or bacterial infection,
particularly viral infections by viruses such as HIV and HCV. This could for example be
accomplished by targeting the antagonists to cells infected by the virus or directly to bacteria. Once
inside these cells, the antagonist will inhibit or at least decrease protein synthesis, resulting in an
inhibition or a decrease in bacterial and/or viral replication. In yet another preferred embodiment of
the invention, the protein or part thereof could be used to develop antagonists that could be
administered to patients in order to inhibit abnormal and/or unregulated cellular proliferation found
in diseases such as cancers, psoriasis, systemic lupus erythematosus (SLE), arthritis, endometriosis,
enteropathy in immunodeficiency virus infection, venous eczema (inducing connective tissue
sclerosis in lipodermatosclerosis and causing the reduced reepithelialization tendency in venous
ulcers), chronic irritant contact dermatitis (CICD), adult polycystic kidney disease (APKD),
ichthyosis, and cholesteatoma.

Proteins of SEQ ID NOs:303 (internal designation number 187-31-0-0-F12-CS) and 275 (internal
designation number 145-59-2-0-A7-CS)

The 148-amino-acid long protein of SEQ ID NO:303, encoded by the cDNA of SEQ ID
NO:62, found in fetal kidney and highly expressed in this organ, is homologous to the human
RNA-associated protein HSCP250 (SPTREMBLNEW SPTREMBL SWISSPROT accession number
AAF36170 and GENESEQP accession number Y84433). In addition, this protein displays
significant homology to the ribosomal L27 protein of D. melanogaster (GENPEPT GENPEPTNEW
accession number AE003576) and to the 50S ribosomal L27 protein of E. coli (SWISSPROT
accession number P02427). The protein of SEQ ID NO:303 has a putative signal peptide, from
amino acid position 13 to 27. According to the PFAM program, the protein of the invention also
presents a ribosomal L27 protein signature in position 31 to 81. Amino acid residues in position 64
to 78 are highly similar to the consensus pattern: G-X-[LIVM](2)-X-R-Q-R-G-X(5)-G, where X is
any amino acid (the motif found in the protein of SEQ ID NO:303 is G-X-I-I-X-T-Q-R-H-X(5)-G).
Potential phosphorylation sites exist in positions 32, 38, 47 (S amino residues), 60 (Y amino
residue), 69 and 141 (T amino residues). One of them, the T residue in position 69, is embedded in
the ribosomal L27 protein signature described above.
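
The comparison between the consensus pattern and the motif found in SEQ ID NO:303 can be made explicit with a short script. The following Python sketch is illustrative only. The function names are hypothetical, only the simple pattern elements used above are supported, and the letter X is kept as a literal placeholder for the residues that the text does not specify.

```python
import re

# One pattern element: a single residue, X, or a bracketed set, with an
# optional repeat count in parentheses, e.g. "G", "X(5)", "[LIVM](2)".
ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Za-z])(?:\((\d+)\))?")


def expand(pattern):
    """Expand a pattern into one allowed-residue set per position (None = any)."""
    allowed = []
    for element in pattern.split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        token, repeat = m.group(1), int(m.group(2) or 1)
        choice = None if token.upper() == "X" else set(token.strip("[]"))
        allowed.extend([choice] * repeat)
    return allowed


def mismatches(pattern, motif):
    """Return 1-based motif positions whose residue violates the pattern."""
    allowed = expand(pattern)
    if len(allowed) != len(motif):
        raise ValueError("motif length does not match pattern length")
    return [i for i, (choice, residue) in enumerate(zip(allowed, motif), start=1)
            if choice is not None and residue not in choice]


L27_CONSENSUS = "G-X-[LIVM](2)-X-R-Q-R-G-X(5)-G"
# Motif reported for SEQ ID NO:303, with unspecified residues left as "X".
SEQ303_MOTIF = "GXIIXTQRHXXXXXG"

print(len(expand(L27_CONSENSUS)))               # pattern length in residues
print(mismatches(L27_CONSENSUS, SEQ303_MOTIF))  # positions deviating from consensus
```

Under these assumptions the pattern spans 15 residues, matching the interval 64 to 78, and the reported motif departs from the consensus at motif positions 6 and 9 (T in place of R, and H in place of G). With the motif starting at residue 64, motif position 6 corresponds to residue 69, the T residue noted above as a potential phosphorylation site.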

The protein of SEQ ID NO:275, encoded by the cDNA of SEQ ID NO:34, is a
94-amino-acid long variant of the SEQ ID NO:303 protein. While the first 81 amino acid residues of
protein of SEQ ID NO:275 are strictly homologous to the first 81 amino acid residues of protein of SEQ ID
NO:303, the 13 subsequent amino acids are different. In addition to the putative signal peptide
(position 13 to 27), the ribosomal L27 protein signature (position 64 to 78), and phosphorylation
sites (positions 32, 38, 47, 60 and 69), the protein of SEQ ID NO:275 also displays a candidate
membrane-spanning segment in position 74 to 94.

Ribosomal protein L27 is one of the proteins of the large ribosomal subunit. L27 belongs to
a family of ribosomal proteins which, on the basis of sequence similarities, includes: eubacterial
L27, plant chloroplast L27 (nuclear-encoded), algal chloroplast L27 and yeast mitochondrial YmL2
(gene MRPL2 or MRP7). Among the different ribosomal L27 proteins characterized so far, the one
of E. coli is probably the best studied. Protein L27 is one of the smallest and most basic
polypeptides in the E. coli ribosome. Techniques like the measurement of protein exposure by hot
tritium bombardment have shown that L27 of the large subunit is well exposed on the surface of the
E. coli 70S ribosome (Agafonov et al., Proc. Natl. Acad. Sci. 94:12892-12897 (1997)). Chemical
and UV-crosslinking studies have demonstrated that L27 is closely associated with domain V of the
23S rRNA, a region that comprises part of the peptidyl transferase center (Osswald et al., Nucleic
Acids Res. 18:6755-6760 (1990)). Direct evidence for the presence of L27 at the peptidyl
transferase center was obtained through the use of derivatives of tRNAPhe containing photoreactive
azidonucleotides within the 3'-terminal ACCA-OH sequence (Wower et al., Proc. Natl. Acad. Sci.
86:5232-5236 (1989)). Analysis of a mutant E. coli strain in which the rpmA gene, which encodes
L27, was replaced by a kanamycin marker, has suggested that L27 contributes to peptide bond
formation by facilitating the proper placement of the acceptor end of the A-site tRNA at the
peptidyl transferase center (Wower et al., J. Biol. Chem. 273:19847-19852 (1998)). Further, recent
studies conducted by Thiede and collaborators have precisely determined RNA-protein contact sites
in the 50S ribosomal subunit of E. coli (Thiede et al., Biochem. J. 334:39-42 (1998)), showing that
Lys-71 and Lys-74 of L27 interact with U-2334 of the 23S rRNA.

It is believed that the proteins of SEQ ID NOs:303 and 275 are human RNA-associated
proteins. Preferred polypeptides of SEQ ID NO:303 are polypeptides comprising the amino acids
from positions 13 to 27, 64 to 78 and amino acid residues in positions 32, 38, 47, 60, 69 and 141. It
is believed that the protein of SEQ ID NO:275 is a 94 amino acid long variant of the 148 amino acid
residues protein of SEQ ID NO:303. Preferred polypeptides of SEQ ID NO:275 are polypeptides
comprising the amino acids from positions 13 to 27, 64 to 78, 74 to 94 and amino acid residues in
positions 32, 38, 47, 60, and 69. Other preferred polypeptides of the invention are fragments of
SEQ ID NO:303 or 275 having any of the biological activities described herein.

One embodiment of the present invention involves the use of the present proteins and 
nucleic acids to specifically identify cells from the kidney, especially from the fetal kidney. Such 
cells can be detected by virtue of their strong expression of the protein of the invention, and can
thus be detected using any standard method for detecting protein expression or activity, including
methods involving antibodies, specific nucleic acids, or any other detectable molecule that 
specifically binds to the polypeptides or polynucleotides of the invention. An ability to specifically 
detect kidney cells is useful, e.g. for determining the identity of tumor cells as well as for the 
identification of specific cell types and tissues for, e.g. histological analyses. 

In another embodiment of the present invention, the present proteins are used as a
component of in vitro eukaryotic translation systems. Such systems represent a widely used tool for
protein production with many academic and industrial applications. Similarly, inhibitors of the
protein of the invention, e.g. antibodies or dominant negative forms of the protein, can be used to
inhibit in vitro translation systems, e.g. to specifically stop a translation reaction involving a
eukaryotic cell extract.

In another embodiment, the proteins of SEQ ID NO:303 or 275 can be used to bind to
nucleic acids, preferably RNA, alone or in combination with other substances. For example, the
proteins of the invention or part thereof can be added to a sample containing RNAs under optimal
conditions for binding, and allowed to bind to RNAs. In one such preferred embodiment, the proteins
of the invention or part thereof may be used to purify mRNAs, for example to specifically isolate
RNA, e.g. from a specific cell type or from cells grown under particular conditions. Such RNAs
could then be reverse transcribed and cloned, analyzed for relative expression,
etc. In addition, such methods may be used to specifically remove RNA from a sample, for
example during the purification of DNA. To carry out any of these methods, the proteins of the
invention or part thereof may be bound to a chromatographic support, either alone or in
combination with other RNA binding proteins, to form an affinity chromatography column. A
sample containing a mixture of nucleic acids to purify is then run through the column.
Immobilizing the proteins of the invention or part thereof on a support is particularly advantageous
for embodiments in which the method is to be practiced on a commercial scale. This
immobilization facilitates the removal of RNAs from the batch of resin-coupled protein after
binding, and allows subsequent re-use of the protein. Immobilization of the proteins of the
invention or part thereof can be accomplished, for example, by inserting any matrix binding domain
in the protein according to methods known to those skilled in the art. The resulting fusion product
including the proteins of the invention or part thereof is then covalently, or by any other means,
bound to a protein, carbohydrate or matrix (such as gold, "Sephadex" particles, polymeric surfaces).

Still another embodiment of the invention relates to methods of preparing antibodies
directed against the proteins of the invention or part thereof. Such antibodies may be used, e.g., in
co-immunoprecipitation experiments to separate and purify RNAs associated with the proteins of
the invention. To accomplish this, in a sample containing a mixture of nucleic acids, antibodies
directed against the protein of the invention may be added in association with protein A or protein G
sepharose beads. Immunoprecipitation conditions are well known to those skilled in the art.

The invention further relates to methods and compositions used to modify the proteins of
the invention. In a preferred embodiment, K residues of the proteins of the invention are
replaced with other basic residues (R residues), as some of these K residues seem to be
crucial for RNA interactions (Thiede et al., Biochem. J. 334:39-42 (1998)). Conversely, R residues
of the proteins of the invention may be replaced with K residues. These substitutions are predicted
to change the specificity and/or the affinity of the proteins of the invention for RNA molecules.
Another preferred embodiment may be to perform post-translational modifications of the proteins of
the invention, notably at the level of the putative phosphorylation sites described above in positions
32, 38, 47, 60, 69 and 141 of SEQ ID NO:303. By adding negative charges to the proteins of the
invention, phosphorylation at these sites may modulate the affinity of the protein for RNA molecules.
Phosphorylation of the T residue in position 69 is of great interest, as it is embedded in the ribosomal
L27 protein signature.
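
As a purely illustrative aid, the K-to-R substitutions described above can be planned in silico before any mutagenesis is attempted. The following Python sketch uses a hypothetical helper and a hypothetical toy sequence. It is not SEQ ID NO:303 or 275, and the positions chosen are arbitrary.

```python
def substitute_residues(sequence, positions, old, new):
    """Return a copy of `sequence` with `old` replaced by `new` at 1-based positions.

    Raises ValueError if a requested position does not hold the expected residue,
    which guards against off-by-one errors in the position list.
    """
    residues = list(sequence)
    for pos in positions:
        if residues[pos - 1] != old:
            raise ValueError(f"position {pos} is {residues[pos - 1]!r}, not {old!r}")
        residues[pos - 1] = new
    return "".join(residues)


def basic_residue_count(sequence):
    """Count lysine (K) and arginine (R) residues."""
    return sequence.count("K") + sequence.count("R")


# Hypothetical toy sequence; K residues at positions 3, 6 and 10.
toy = "MAKGTKRSLK"
mutant = substitute_residues(toy, [3, 6], "K", "R")
print(mutant)
print(basic_residue_count(toy), basic_residue_count(mutant))
```

Because K and R are both basic, such a substitution leaves the number of basic residues unchanged while altering the side chain presented to the RNA, which is the property the embodiment above seeks to exploit.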

In another preferred embodiment, the proteins of the invention or part thereof may be used
to visualize RNAs, when the polypeptides are linked to an appropriate fusion partner, or are detected
by probing with an antibody.

Another embodiment of the present invention relates to methods and compositions using the
proteins of the invention, or part thereof, to associate specific mRNAs to the inner face of lipid
bilayers of liposomes in order to further introduce these mRNAs into the cytoplasm of eukaryotic
cells. For example, as described above, the protein of the invention of SEQ ID NO:275 displays
both a candidate membrane-spanning segment in position 74 to 94 (at its very carboxy-terminal
part), and a ribosomal L27 protein signature in position 64 to 78. Moreover, the protein of SEQ ID
NO:275 is an RNA binding protein. Preferably, specific mRNAs are first associated with the protein of the
invention and the RNA/protein complex formed in that way is then mixed with liposomes according
to methods known to those skilled in the art. These liposomes are added to an in vitro culture of
eukaryotic cells. In vivo, such a method might treat and/or prevent disorders linked to
dysregulation of gene transcription such as cancer and other disorders relating to abnormal cellular
differentiation, proliferation, or degeneration.

In another embodiment, the present proteins and nucleic acids can be used to modulate the
rate of cell growth in vitro or in vivo. Studies in Drosophila have shown that a decrease in
ribosome function results in a significant inhibition of cell growth. Accordingly, compounds that
inhibit the expression or function of the proteins of the invention can be used to inhibit the growth
rate of cells, and can thus be used, e.g. in the treatment or prevention of diseases or conditions
associated with excessive cell growth, such as cancer or inflammatory conditions. Such compounds
include, but are not limited to, antibodies, antisense molecules, dominant negative forms of the
proteins, and any heterologous compounds that inhibit the expression or the activity of the proteins.

Protein of SEQ ID NO:269 (internal designation 116-115-2-0-F8-CS)

The protein of SEQ ID NO:269, encoded by the cDNA of SEQ ID NO:28, shows homology
with the mink whale ribonuclease A (Emmens M., et al., Biochem. J. 157:317-323 (1976)), a
member of the pancreatic ribonuclease family. In addition, the protein of the invention exhibits 2
membrane-spanning segments, the first from amino acid positions 1-21, the second from amino acid
positions 179-199. The cDNA of SEQ ID NO:28 is composed of 3 exons. Exon 1 is encoded by
nucleotides 1-225, exon 2 by nucleotides 226-288, and exon 3 by nucleotides 289-597. The protein
of the invention is highly expressed in the testis.
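
The exon coordinates given above can be checked for consistency with a few lines of code. The Python sketch below is illustrative only; the variable and function names are hypothetical, and the coordinates are those stated for the cDNA of SEQ ID NO:28.

```python
# (name, first nucleotide, last nucleotide), 1-based inclusive, as stated above.
EXONS = [("exon 1", 1, 225), ("exon 2", 226, 288), ("exon 3", 289, 597)]


def exon_lengths(exons):
    """Length in nucleotides of each exon (inclusive coordinates)."""
    return {name: end - start + 1 for name, start, end in exons}


def is_contiguous(exons):
    """True if each exon starts immediately after the previous one ends."""
    return all(prev[2] + 1 == nxt[1] for prev, nxt in zip(exons, exons[1:]))


lengths = exon_lengths(EXONS)
print(lengths)
print(sum(lengths.values()), is_contiguous(EXONS))
```

The three exons are 225, 63 and 309 nucleotides long, are contiguous, and together account for nucleotides 1 to 597 of the cDNA.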

Ribonucleases are proteins that catalyze the hydrolysis of phosphodiester bonds in RNA
chains. Pancreatic ribonucleases are pyrimidine-specific ribonucleases present in high quantity in
the pancreas of a number of mammalian taxa and of a few reptiles. In addition to their function in
the hydrolysis of RNA, ribonucleases have evolved to support a variety of other physiological
activities. Such activities include anti-parasite, anti-bacterial, anti-virus, and anti-neoplastic
activities, as well as, in some cases, promoting neurotoxicity and angiogenesis. For example,
bovine seminal ribonuclease is anti-neoplastic (Laccetti, P. et al. (1992) Cancer Res. 52: 4582-
4586), and some frog ribonucleases display both anti-viral and anti-neoplastic activity (Youle, R. J.
et al. (1994) Proc. Natl. Acad. Sci. USA 91: 6012-6016; Mikulski, S. M. et al. (1990) J. Natl.
Cancer Inst. 82: 151-152; and Wu, Y.-N. et al. (1993) J. Biol. Chem. 268: 10686-10693). In
addition, angiogenin is a tRNA-specific ribonuclease which binds actin on the surface of endothelial
cells for endocytosis and is then translocated to the nucleus where it promotes endothelial
invasiveness required for blood vessel formation (Moroianu, J. and Riordan, J. F. (1994) Proc. Natl.
Acad. Sci. USA 91: 1217-1221). Further, eosinophil-derived neurotoxin (EDN) and eosinophil
cationic protein (ECP) are related ribonucleases which possess neurotoxicity (Beintema, J. J. et al.
(1988) Biochemistry 27: 4530-4538; Ackerman, S. J. (1993) In Makino, S. and Fukuda, T.,
Eosinophils: Biological and Clinical Aspects. CRC Press, Boca Raton, Fla., pp 33-74). ECP also
exhibits cytotoxic, anti-parasitic, and anti-bacterial activities. Finally, an EDN-related ribonuclease,
RNase k6, is expressed in normal human monocytes and neutrophils, suggesting a role for this
ribonuclease in host defense (Rosenberg, H. F. and Dyer, K. D. (1996) Nuc. Acid. Res. 24: 3507-
3513).

It is believed that the protein of SEQ ID NO:269 is a ribonuclease, and is thus capable of
hydrolyzing ribonucleic acids, and is involved in a number of processes including defense
against infection and neoplasia, as well as in neurotoxicity and angiogenesis. Preferred
polypeptides of the invention are any fragments of SEQ ID NO:269 having any of the biological
activities described herein. The ribonuclease activity of the protein of the invention or part thereof
may be assayed using any assay known to those skilled in the art, including those described in US
patent 5,866,119.

In one embodiment, the present polynucleotides and polypeptides are used to specifically
detect testis tissue and cells derived from the testis, as the present protein is overexpressed in this
tissue. For example, the protein of the invention or part thereof may be used to synthesize specific
antibodies using any technique known to those skilled in the art. Such tissue-specific antibodies
may then be used to identify tissues of unknown origin, such as in forensic samples, differentiated
tumor tissue that has metastasized to foreign bodily sites, etc., or to differentiate different tissue
types in a tissue cross-section using immunochemistry.

The present invention relates to methods and compositions using the protein of the 
invention or part thereof to hydrolyze one or several substrates, preferably nucleic acids, more
preferably RNA, alone or in combination with other substances. For example, the protein of the
invention or part thereof is added to a sample containing a substrate(s) in conditions amenable to 
enzyme activity, and the protein thus catalyzes the hydrolysis of the substrate(s). 

In a preferred embodiment, the protein of the invention or part thereof may be used to
remove contaminating RNA in a biological sample, alone or in combination with other nucleases.
In a more preferred embodiment, the protein of the invention or part thereof is used to remove
contaminating RNA from DNA preparations, or to remove RNA templates prior to second strand
synthesis and prior to analysis of in vitro translation products. In one such embodiment, the protein
of the invention or part thereof is added to a biological sample as a "cocktail" along with other
nucleases. The advantage of using a cocktail of hydrolytic enzymes is that one is able to hydrolyze
a wide range of substrates without knowing the specificity of any of the enzymes, or even the
identity of all of the substrates. Such cocktails of nucleases are commonly used in molecular
biology assays, for example to remove unbound RNA in RNase protection assays. Using a cocktail
of hydrolytic enzymes also protects a sample from a wide range of future unknown RNA
contaminants from a vast number of sources. For example, the protein of the invention or part
thereof is added to samples where contaminating substrates are undesirable. Alternatively, the
protein of the invention or part thereof may be bound to a chromatographic support, either alone or
in combination with other hydrolytic enzymes, using techniques well known in the art, to form an
affinity chromatography column. A sample containing the undesirable substrate is run through the
column to remove the substrate. Immobilizing the protein of the invention or part thereof on a
support is particularly advantageous for those embodiments in which the method is to be practiced
on a commercial scale. This immobilization facilitates the removal of the enzyme from the batch of
product and subsequent reuse of the enzyme. Immobilization of the protein of the invention or part
thereof can be accomplished, for example, by inserting a cellulose-binding domain in the protein.
One of skill in the art will understand that other methods of immobilization could also be used and
are described in the available literature. Alternatively, the same methods may be used to identify
new substrates.

In another embodiment, the protein of the invention or part thereof may be used to
decontaminate or disinfect samples infected by undesirable parasites, bacteria and/or viruses using
any of the methods known to those skilled in the art, including those described in Youle et al.
(1994), supra; Mikulski et al. (1990), supra; and Wu et al. (1993), supra. In a preferred embodiment,
the protein is used to eliminate RNA viruses from a sample or in a patient.

In another embodiment, the present invention relates to compositions and methods using the 
protein of the invention or part thereof to selectively kill cells. The protein of the invention or part
thereof is linked to a recognition moiety capable of binding to a chosen cell, such as lectins,
receptors or antibodies, thereby generating cell-specific cytotoxic reagents as described in US
Patent No. 5,955,073, the disclosure of which is herein incorporated by reference in its entirety.

In another embodiment, the protein of the invention or part thereof is used in the diagnosis,
prevention and/or treatment of neoplastic disorders. In one such embodiment, cancer can be treated 
or prevented in a patient by increasing the activity of the present protein in the patient, particularly 
within neoplastic or hyperplastic cells within the patient. For example, the protein of the
invention or part thereof, a polynucleotide encoding the protein, or a compound
that causes an increase in the expression or activity of the protein, can be administered to the 
patient, or to cells derived from the patient, in vivo or ex vivo. Preferably, the protein, 
polynucleotide, or compound is specifically targeted to the neoplastic or hyperplastic cells, for 
example by intratumoral injection of the molecule or by linking the molecule to a targeting moiety,
such as a tumor cell-specific antibody.

In another embodiment, cancer can be treated or prevented in a patient by inhibiting the 
expression or activity of the protein of the invention in endothelial cells of the patient, in particular 
within endothelial cells involved in angiogenesis. Such expression or activity can be inhibited in 
any of a number of ways, for example using antibodies, antisense sequences, ribozymes, dominant
negative forms of the protein, as well as small molecule inhibitors of protein activity or expression.

In another embodiment, the present polynucleotide and polypeptide sequences are used to 
diagnose cancer in a patient. In a typical such embodiment, a biological sample is obtained from a 
patient, and the level of the present polypeptides or polynucleotides is detected and compared with a 
control level, where a difference between the level observed in the patient and the control level
indicates the presence of cancer in the patient.

In another embodiment, the present protein is inhibited within cells of a mammal in order
to protect cells of the mammal against RNAse-associated neurotoxicity. In a typical such 
embodiment, the level of the protein is detected within the cells of the patient, where an elevated 
level of the protein, particularly within neurons, indicates a risk for neurotoxicity. The level of the
expression or activity of the protein is subsequently inhibited using any standard method, such as
antibodies, antisense molecules, ribozymes, dominant negative forms of the protein, or any other 
compounds that inhibit the expression or activity of the protein. Preferably, such inhibitors are 
specifically directed to the neurons of the mammal. 

Protein of SEQ ID NO: 390 (internal designation 116-118-4-0-A8-CS)
The present inventors have provided a new gene and protein described in SEQ ID No 149
and 390 respectively, belonging to the carbonic anhydrase (CA) family, more particularly the
alpha-CA family. This novel alpha-CA related gene is located on the human chromosome 17q24 region.
The carbonic anhydrases (EC 4.2.1.1) (CA) are zinc metalloenzymes which catalyze the
reversible hydration of carbon dioxide. Nine different active alpha-carbonic anhydrases
(alpha-CA) that catalyze the hydration reaction have been found, as well as at least two alpha-CA-related
enzymes. All known carbonic anhydrases from the animal kingdom are alpha-CAs, as opposed to
beta- and gamma-CAs, which are also zinc-containing enzymes but are unrelated by sequence. The
protein of SEQ ID No. 390 displays significant homology to the Pfam carbonic anhydrase domain
between amino acids 20 and 59 of the protein, in particular the motif Gly-Ser-Glu-His at positions 45
to 48 of the protein, which has been found to be highly conserved in a multi-alignment published by
Lovejoy et al. 1998 (Genomics 54, 484-493). The chromosomal localization was found by BLAST
alignment with a sequence mapping to chromosome 17 (genbank genomic fragment with accession 
number AC002090) and another alignment with a genomic fragment (accession number AF064854) 
which maps the gene in the 17q24 region. The polypeptide of SEQ ID No. 390 displays particularly 
high homology to a human protein called carbonic anhydrase-related protein 10 (genbank accession
number AB036836, published directly in database by Adachi, K. and Nishimori, I.).

The human alpha-CAs contain a highly conserved catalytic site which comprises a zinc 
coordination polyhedron defining an active site located in a large cone-shaped cavity that extends 
almost to the center of the alpha-CA molecule. One side of the cavity is formed by hydrophobic
residues, while the other side contains hydrophilic residues, including Thr199 and Glu106
(referring to CA II enzyme). The zinc ion is located at the bottom of this cleft, and is tetrahedrally
coordinated to the imidazoles of three histidine residues (His94, His96, His119, referring to CA II
enzyme) and to a water molecule called the 'zinc water' that ionizes to a hydroxide ion with a pK of
about 7 (Sly et al., Ann. Rev. Biochem. 64:375-401 (1995)). Studies have shown that the
Zn-OH-/Thr199/Glu106 network is important in binding bicarbonate, sulfonamide inhibitors and many
anionic inhibitors (Liljas et al., Eur. J. Biochem. 219:1-10).
Improved alpha-CA inhibitors 

Of the human alpha-CAs, it has been found that the various isozymes have differing tissue 
distributions and intracellular localizations. Alpha-CA II for example, is expressed in the cytosol of 
some cell types in virtually every tissue or organ, while alpha-CA I is expressed in colon and
erythrocytes, and alpha-CA IV is expressed on the apical surfaces of epithelial cells of some
segments of the nephron, the apical plasma membrane in the lower gastrointestinal tract, and the
plasma face of endothelial cells of certain capillary beds. The protein of SEQ ID NO: 390 encoded 
by the cDNA of SEQ ID NO: 149 has been found by the present inventors to be expressed in testes. 
The human alpha-CAs have been found to be involved in a range of important biological
functions involving pH regulation, CO2 and HCO3- transport, ion transport and water and
electrolyte balance. Functions in which alpha-CAs are involved include H+ secretion, HCO3-
reabsorption, HCO3- secretion, bone resorption, and production of aqueous humor, cerebrospinal 
fluid, gastric acid and pancreatic juice. Of particular medical interest, CA II has been found to be
implicated in osteopetrosis, as CA II defects have been found to be a cause of inherited osteopetrosis,
found along with renal tubular acidosis and brain calcification.

CA activity can be determined by well known means. Assays used to characterize CA 
isozyme activity are provided for example in Khalifah, J. Biol. Chem. 246:2561-73 (1971); Chen et
al., Biochem. 32:7861-65 (1993); Tu et al., J. Biol. Chem. 258:8867-8871 (1986); and Jewel et al.,
Biochem. 30:1484-1490 (1991). 

Many different inhibitors of CA have been identified, and certain CA inhibitors have been 
developed as medicaments. CA inhibitors are currently a primary treatment for glaucoma, where
inhibition of CA activity reduces intra-ocular pressure by inhibiting formation of aqueous humor.
Approved CA inhibitors for glaucoma include Acetazolamide (Diamox®), Methazolamide 
(Neptazane®), Dorzolamide (Trusopt®) and Brinzolamide (Azopt®). 

Improved broad-acting CA inhibitors 

In certain treatment settings, there is a need for CA inhibitors capable of inhibiting the
broad class of CA isozymes so as to inhibit CO2 hydration activity. Local (topical) use of CA
inhibitors has been found advantageous over systemic application for glaucoma, allowing systemic 
side effects of CA inhibitors to be avoided. However, current CA inhibitors may have limited 
efficacy in terms of ability to completely inhibit total CA activity. A topical treatment,
Dorzolamide (Trusopt®), for example, is an inhibitor of CA II, but only weakly inhibits CA VI
(Hoyng et al., Drugs, 50(3):411-434 (2000)). Inhibitors of CA II may be unable to inhibit CA I or
other CAs, which is thought to result in decreased drug efficacy because other CAs can compensate
for loss of CA II activity (Sly and Hu, Ann. Rev. Biochem. 64:375-401 (1995)). In one example,
CA I is five to six times as abundant as CA II in human erythrocytes, but has only about 15% of the
activity. Thus, CA I contributes about 50% of the total CA activity (Dodgson et al., J. Appl. Physiol.
64:1492-80 (1988)). Moreover, CA I may have a different inhibitor sensitivity profile from CA II, as
CA I is less sensitive to sulfonamide inhibitors, for example. CA II and CA IV, on the other hand,
show significant resistance to inhibition with halide ions in comparison to CA I (Sly et al. (1995),
supra). Thus, a significant amount of residual CA activity in a cell or tissue of interest may be due
to other CAs, including the polypeptide of SEQ ID No. 390. 
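
The estimate of the CA I contribution given above can be checked with a short calculation. The following Python sketch is illustrative only. It assumes that "15% of the activity" means the per-molecule activity of CA I relative to CA II, and that CA I and CA II are the only contributors to total activity; the function name is hypothetical.

```python
def ca1_share_of_total_activity(abundance_ratio, relative_specific_activity):
    """Fraction of combined CA I + CA II activity contributed by CA I.

    abundance_ratio: amount of CA I per unit amount of CA II (assumption: molar).
    relative_specific_activity: per-molecule activity of CA I, with CA II = 1.
    """
    ca1_activity = abundance_ratio * relative_specific_activity
    ca2_activity = 1.0  # CA II is the reference: one unit of enzyme, unit activity
    return ca1_activity / (ca1_activity + ca2_activity)


# "Five to six times as abundant", "about 15% of the activity"
for ratio in (5.0, 6.0):
    share = ca1_share_of_total_activity(ratio, 0.15)
    print(f"CA I:CA II abundance {ratio:.0f}:1 -> CA I share of activity {share:.0%}")
```

Under these assumptions the CA I share comes out at roughly 43% to 47%, in line with the statement that CA I contributes about half of the total activity.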

Thus, in one aspect, alpha-CA related nucleic acid and polypeptide may be useful for the
identification of compounds capable of inhibiting the alpha-CA-catalyzed reversible hydration
reaction. In one aspect, the method is carried out to identify or select CA inhibitors capable of 
inhibiting the activity of the polypeptide of SEQ ID No. 390. In other aspects, the method is carried 
out to identify or select CA inhibitors capable of broadly inhibiting the activity of a large number of
CA enzymes. The nucleic acid and polypeptide sequences of the invention can be used in computer
based drug design or for carrying out binding predictions with candidate CA inhibitors in view of
the extensive structural information publicly available for CA enzymes. In preferred embodiments,
the nucleic acid and polypeptide of the invention are used in drug screening assays. Assays may be
cell based or non-cell based assays. In one embodiment, a nucleotide or polypeptide sequence of
the invention is brought into contact with a candidate CA inhibitor (such as a CA II inhibitor), and
binding of the candidate inhibitor to the polypeptide of the invention, or the activity of the 
polypeptide of the invention is detected. Activity of the polypeptide of the invention may be CA
activity, or any other suitable activity possessed by the polypeptide of the invention which may be
inhibited by binding of the candidate substance. Assays for detecting hydration of carbon dioxide
are well known, and referenced above. In preferred embodiments, a panel of CA isozymes including
the polypeptide of the invention is screened against the candidate substance, including one or more
enzymes selected from the group consisting of CA I, CA III, CA IV, CA VI, and a CA-RP
including but not limited to CA-RP VII, CA-RP X and CA-RP XI. In preferred embodiments, a
candidate CA inhibitor is selected according to its ability to broadly inhibit CA isozymes capable of
catalyzing the hydration of carbon dioxide. Means to conduct such drug screening assays are well
known in the art. In one embodiment drug binding is tested, using means described further herein
as well as for example in International Patent Publication No. WO 00/58510, the disclosure of
which is incorporated herein by reference in its entirety, particularly the section titled "Methods for
screening substances interacting with... polypeptides". Drug binding assays on a large panel of
isozymes may also be carried out in high throughput format using commercially available binding 
assay systems (Graffinity Pharmaceutical Design GmbH, Heidelberg, Germany). 

The method according to the invention may generally be used to identify or select candidate
compounds for the treatment of a disorder characterized by a defect in pH regulation, CO2 and
HCO3- transport, ion transport or water and electrolyte balance.
Improved selective CA inhibitors 

In other therapeutic strategies, CA inhibitors are delivered orally. However, systemic
delivery may affect CA enzymes present in other tissues or organs, leading to harmful side effects.
It can be expected that CA II inhibitors may also partially or fully inhibit other CA isozymes, such
as CA I, or a CA-related protein (CA-RP) such as CA-RP VIII, X, XI or CA-RPTP-beta (Tashian et
al., in "Carbonic Anhydrase: New Horizons", W.R. Chegwidden, N.D. Carter and Y.H. Edwards,
Eds., Birkhauser, Basel; Adachi, K. et al., Genbank accession number AB036836; Lovejoy et al.
(1998); Peles et al. (1995)) or the CA polypeptide of SEQ ID No 390. CA-RPTP-beta, for example,
has a CA domain having no or reduced CA activity but is thought to be involved in ligand
binding or participation in protein complexes in view of its binding of contactin by an extracellular
region (Peles et al., Cell 82:251-260 (1995); Tashian et al. (1998), supra). The inhibition of an
isozyme such as CA-RPTP-beta or the isozyme of the present invention by systemic treatment using
a non-selective drug may result in harmful side effects.

In one example, while oral CA inhibitors for the treatment of glaucoma (e.g. acetazolamide)
have been effective and without ocular adverse effects, they have shown important systemic effects,
including paresthesia of the acra, fatigue, depression, renal stones and gastrointestinal complaints
such as nausea and diarrhoea (Hoyng et al., 2000, supra). Because CA inhibitors are typically used
permanently (e.g. for glaucoma), or over long periods of time, avoiding side effects is particularly
important. Selective CA inhibitors capable of inhibiting a CA isozyme of interest to a greater extent 
than another CA isozyme may thus offer improved means for the treatment of disease.

In one embodiment, the nucleotide and polypeptide sequences of the present invention may
be used to design selective CA inhibitors. Studies have also shown that the different alpha-CAs have
different inhibitor binding properties (Sly et al. (1995), supra), suggesting that it may be possible to
provide compounds that inhibit a CA isozyme of interest, such as CA II, while not binding to or
inhibiting related enzymes such as the polypeptide of SEQ ID No. 390. The nucleic acid and
polypeptide sequences of the invention can be used in computer based drug design or for carrying
out binding predictions with candidate CA inhibitors in view of the extensive structural information
publicly available for CA enzymes. In preferred embodiments, the nucleic acid and polypeptide of
the invention are used in drug screening assays, including both cell based and non-cell based assays.
In one embodiment, a nucleotide or polypeptide sequence of the invention is brought into contact
with a candidate CA inhibitor (such as a CA II inhibitor), and binding of the candidate inhibitor to 
the polypeptide of the invention, or the activity of the polypeptide of the invention is detected. 
Activity of the polypeptide of the invention may be CA activity, or any other suitable activity 
possessed by the polypeptide of the invention which may be inhibited by binding of the candidate
substance. In preferred embodiments, a panel of CA isozymes including the polypeptide of the
invention is screened against the candidate substance, including the polypeptide of SEQ ID No 390
and one or more enzymes selected from the group consisting of CA I, CA III, CA IV, CA VI, and a
CA-RP including but not limited to CA-RP VII, CA-RP X and CA-RP XI. In preferred embodiments, a
candidate CA inhibitor is screened against one or more non-catalytic CA related proteins to
eliminate undesired inhibition of these enzymes which may be involved in other important
physiological functions. Means to conduct such drug screening assays are well known in the art.
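
One way the results of such a panel screen might be summarized is as a fold-selectivity for the isozyme of interest over each other panel member. The Python sketch below is a hypothetical illustration, not a method described in the present application: it assumes that inhibition is reported as an IC50 for each panel member, and the IC50 values shown are invented placeholders rather than measured data.

```python
def fold_selectivity(ic50_off_target_nm, ic50_target_nm):
    """Ratio of off-target IC50 to target IC50 (larger = more selective)."""
    if ic50_target_nm <= 0 or ic50_off_target_nm <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_off_target_nm / ic50_target_nm


def selectivity_profile(panel_ic50_nm, target):
    """Fold-selectivity of a candidate for `target` over every other panel member."""
    target_ic50 = panel_ic50_nm[target]
    return {
        name: fold_selectivity(ic50, target_ic50)
        for name, ic50 in panel_ic50_nm.items()
        if name != target
    }


# Placeholder IC50 values in nM, invented purely for illustration.
example_panel = {"CA II": 10.0, "CA I": 500.0, "SEQ ID No 390": 2000.0}
profile = selectivity_profile(example_panel, target="CA II")
for isozyme, fold in sorted(profile.items(), key=lambda item: item[1]):
    print(f"{isozyme}: {fold:.0f}-fold less sensitive than CA II")
```

A candidate with a high fold-selectivity over the polypeptide of SEQ ID No 390 and over the non-catalytic CA related proteins would be retained as a selective inhibitor under this scheme.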

Increasing alpha-CA activity for the treatment of alpha-CA deficiency disease 

The polypeptide of the invention may also be used as a source of CA activity, such as for
the treatment of disease. Defects in carbonic anhydrases are the cause of several diseases,
including osteopetrosis (abnormally dense bone), renal tubular acidosis, cerebral calcification and
mental retardation. Also, a carbonic anhydrase-related protein is described as being linked to
cone-rod retinal dystrophy (Bellingham et al., 1998, Biochem. Biophys. Res. Comm. 253, 364-367).

In one aspect, the invention thus involves increasing CA activity by providing increased 
activity of the polypeptide of SEQ ID No. 390. Increased activity of the polypeptide of SEQ ID No
390 can be provided by any suitable means, as further described herein. Activity may be provided
for example by introducing to a host cell or patient a vector containing a nucleotide sequence of
SEQ ID No 149, treating said cell with a compound capable of increasing the expression of the
polypeptide of the invention and/or treating a cell or patient directly with a polypeptide of SEQ ID
No 390. In preferred embodiments, the polypeptide of the invention comprises at least one amino
acid substitution, deletion or insertion. In one aspect, such amino acid changes are preferably in the
catalytic site; preferably said amino acid changes involve the substitution, deletion or insertion of a
His residue and preferably said amino acid changes increase the CO2 hydration activity of the
polypeptide of the invention.
Metal ion biosensors 

In further aspects, metal ion biosensors can be designed based on the polypeptide of SEQ
ID No 390. Determinations of metal ion concentrations in complex media such as serum, cell
cytoplasm and, for example, seawater are important analytical functions that require high
degrees of sensitivity and selectivity.

Biosensors may be particularly useful in detecting metal ion fluxes in and between cells. 
Such biosensors may exploit the metal-binding ability of the polypeptide of the invention, as described
by Thompson et al., who have developed such biosensors based on the CA enzyme (CA II). Such
biosensors are useful in the detection of metal ion flux, for example in the central nervous system.
Zinc-containing neurons found throughout the mammalian cerebral cortex, striatum and amygdalar
nuclei have been shown to release their zinc in a depolarization- and calcium-dependent fashion in
vitro and in vivo. This zinc release has been suggested to act as a trans-synaptic neuromodulator,
which has in turn been linked to excitotoxic neuronal cell death. CA based biosensors developed by
Thompson et al. showed that zinc is present and can be detected in extracellular medium from
neurons (Thompson et al., J. Neurosci. Methods 96:35-45 (2000)).

Biosensors based on CA have been shown to be extremely sensitive and selective, detecting Cu at
subpicomolar levels, a sensitivity of the kind that might be achieved with mass spectrometric
techniques. Sensors based on the CA II isozyme have been shown to detect Zn and Cu at picomolar
levels, and Cd, Co and Ni at nanomolar levels (Thompson et al., Anal. Biochem. 267:185-195
(1999)). CA based biosensors have also demonstrated selectivity over potential interferents in
biological systems at mM levels in extracellular fluids, such as Mg and Ca (Thompson et al.
(2000), supra).

Biosensors based on the polypeptide of the invention are based on the high selectivity and
sensitivity of CA isozymes for zinc. Because the binding of Zn in the active site of the enzyme
affects the enzyme's ability to bind a CA inhibitor, it is possible to use a CA inhibitor that exhibits a
detectable change upon binding to the polypeptide of the invention to detect the fraction of
polypeptide bound to the inhibitor, and therefore bound to Zn. The fraction of polypeptide with
bound Zn in turn is determined by the concentrations of free Zn and the polypeptide of the
invention, and the dissociation constant for zinc. 
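
The relationship between the zinc-bound fraction and the free zinc concentration can be sketched numerically. The following Python example is a minimal illustration under stated assumptions: simple 1:1 binding at equilibrium, and a free Zn concentration that is not significantly depleted by binding to the sensor polypeptide. The function names are hypothetical, and the dissociation constant used is an arbitrary placeholder, not a measured value for the polypeptide of the invention.

```python
def fraction_zinc_bound(free_zn_molar, kd_molar):
    """Fraction of polypeptide holding Zn for 1:1 binding at equilibrium."""
    if free_zn_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_zn_molar / (free_zn_molar + kd_molar)


def free_zinc_from_fraction(fraction_bound, kd_molar):
    """Invert the binding isotherm: free Zn implied by a measured bound fraction."""
    if not 0.0 <= fraction_bound < 1.0:
        raise ValueError("fraction_bound must be in [0, 1)")
    return kd_molar * fraction_bound / (1.0 - fraction_bound)


KD_PLACEHOLDER_MOLAR = 1e-12  # arbitrary illustrative value, not a measurement

for free_zn in (1e-13, 1e-12, 1e-11):
    bound = fraction_zinc_bound(free_zn, KD_PLACEHOLDER_MOLAR)
    print(f"free Zn {free_zn:.0e} M -> fraction bound {bound:.2f}")
```

In such a scheme the sensor responds most steeply when the free Zn concentration is close to the dissociation constant, where half of the polypeptide is occupied.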

In one example, binding of the CA inhibitor to the polypeptide of the invention is detected 
by using a fluorescent inhibitor, whereby the inhibitor shows a detectable change in fluorescence
emission wavelength or polarization upon binding to the polypeptide of the invention. In one
example, a fluorescent sulfonamide is used, such as the fluorophore ABD-N (Thompson et al.
(2000), supra).

Engineered CA enzymes

CA isozymes have been shown to have differing levels of catalytic activity and efficiency.
In preferred embodiments, particularly for treatments which involve providing the increased activity
of the polypeptide of SEQ ID No 390 or for use in metal ion biosensors, the polypeptide of the
invention may be modified for increased CO2 hydration and/or zinc binding.
In particular, studies have been carried out characterizing residues important for maximal
CA activity, allowing CA isozymes to be designed having desired levels of activity. Important
structural elements in CA isozymes for zinc binding, CO2 hydration activity and stability are
reviewed in Lindskog, Pharmacol. Ther. 74(1):1-20 (1997) and Sly (1995), supra. In one example,
studies of CA III showed that changing the Phe198 residue to a Leu198 residue (as in CA II)
resulted in a 25 fold increase in activity (Chen et al. (1993), supra). Catalysis has also been greatly
increased in CA II by replacing the Thr200 residue with His, as is normally found in CA I enzymes.
Most dramatically, a CA-related protein (CA-RP) which in its native form was missing important
residues at the catalytic site and had no detectable CO2 hydration activity at all was rendered an
active CA by only two point mutations (Sjoblom et al., FEBS Lett. 398:322-325 (1996)).

Thus, in embodiments where the polypeptide of the invention is used to provide a source of
CO2 hydration or for its zinc binding properties, it is advantageous to modify the polypeptide of the
invention by introducing at least one amino acid substitution, deletion or insertion. In one aspect,
such amino acid changes are preferably in the catalytic site; preferably said amino acid changes
involve the substitution, deletion or insertion of a His residue and preferably said amino acid
changes increase the CO2 hydration activity of the polypeptide of the invention. Optimal amino
acid changes can be determined by the skilled artisan, particularly in view of sequence comparisons 
which can be carried out with the many well-characterized CA isozymes. 

Protein of SEQ ID NO:252 (internal designation 105-089-3-0-G10-CS)

The protein of SEQ ID NO:252 is encoded by the cDNA of SEQ ID NO:11. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:252
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 105-089-3-0-G10-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:11 described throughout the present application also pertain
to the human cDNA of clone 105-089-3-0-G10-CS. It is overrepresented in fetal brain.

The protein of SEQ ID NO:252 encoded by the cDNA of SEQ ID NO:11 is distributed
primarily in the prostate and salivary gland. The protein of SEQ ID NO:252 is homologous to
sequences described in PCT publication WO9827205-A2 (which describes a protein that was
isolated from a human adult salivary gland cDNA library) and PCT publication WO9839446-A2.
The disclosure of each of the preceding PCT publications is
incorporated herein by reference in its entirety.

The protein of SEQ ID NO:252 is also homologous to a polypeptide described in PCT
publication WO9835229-A1, the disclosure of which is incorporated herein by reference in its
entirety. WO9835229-A1 describes a peptide of 27 amino acid residues, 23 of which are identical to
the corresponding portion of the protein of SEQ ID NO:252 (amino acids 20-46). This corresponds
to 85% identity; counting conserved changes (3 of the 4 differing residues) yields 96% homology.
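
The identity and homology percentages stated above follow directly from the residue counts. The short Python sketch below simply reproduces that arithmetic from the figures given in the text (27 aligned residues, 23 identical, 3 conserved changes); the function and variable names are hypothetical.

```python
def percent_of_positions(count, aligned_length):
    """Share of aligned positions, expressed as a percentage."""
    return 100.0 * count / aligned_length


ALIGNED_LENGTH = 27  # residues in the peptide (amino acids 20-46 of SEQ ID NO:252)
IDENTICAL = 23       # identical residues
CONSERVATIVE = 3     # conserved changes among the 4 differing residues

identity = percent_of_positions(IDENTICAL, ALIGNED_LENGTH)
homology = percent_of_positions(IDENTICAL + CONSERVATIVE, ALIGNED_LENGTH)
print(f"identity: {identity:.0f}%")
print(f"homology: {homology:.0f}%")
```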

The protein described in WO 9835229 was identified in reflex tears that were collected 
from 12 non-contact lens wearing male and female humans. Reflex tears were stimulated by gently 
rubbing the nasal mucosa with a cotton wool tipped bud. Two different batches were collected 
from two different groups and examined by analytical and preparative 2-dimensional
electrophoresis. After separation in the second dimension and transfer to PVDF membranes,
identified protein spots (by 0.1% (w/v) Coomassie Blue) were loaded into a membrane-compatible
Hewlett-Packard cartridge. Sequencing was conducted with a Model G1005A (Hewlett-Packard,
CA) sequenator. One of the proteins identified migrated at 25 kDa and was revealed to have 5
isoforms of different pI. Two of these were N-terminally sequenced and gave the sequence of the
above peptide with a pI of 5.0 and 4.4. The different isoforms indicate that this protein undergoes
post-translational modifications, including sialylation or acylation. The presence of these isoforms 
in different degrees could reflect the disease status of the individual. Accordingly, one embodiment 
of the present invention relates to the detection or diagnosis of disease by determining the activity
or level of the protein of SEQ ID NO:252 or a polynucleotide encoding the protein of SEQ ID
NO:252 in an individual. For example, detection of the secreted protein of SEQ ID NO:252 in an
individual may be accomplished non-invasively by measuring protein levels in bodily fluids into
which the protein is secreted, such as tears and saliva. Such methods may be employed both in
humans and in animals. It is probable that after the signal peptide is cleaved, the protein of SEQ ID 
NO:252 is secreted into bodily fluids including tears and probably saliva. 

The protein of SEQ ID NO:252 can also be used for the screening of non-ocular diseases,
by analyzing tears for marker proteins, particularly those indicative of cancer and genetic disease. In
addition, an altered chromatographic profile (e.g. 2D gel) of the isoforms of the protein of SEQ ID
NO:252 may also indicate the disease state of an individual. For example, the levels of marker
proteins in relation to the protein of SEQ ID NO:252 may be determined to evaluate whether the
individual is suffering from a disease. Alternatively, tears may be analyzed for the levels of
different isoforms of the protein of SEQ ID NO:252 to determine whether the pattern of such
isoforms is indicative of disease. 

The protein of SEQ ID NO:252 or fragments thereof may also be used as a lubricant or
cleansing agent for the eyes. This protein can be included in contact lens washing and storage
solutions. This protein can also be useful as an ingredient in eye washing solutions (e.g. eye drops)
used for everyday redness or healing after surgical/laser intervention. For example, the protein may
be used to reduce eye inflammation. Alternatively, anti-bacterial properties may be exploited by
including the protein of SEQ ID NO:252 or fragments thereof in solutions, creams or ointments for
the eyes, as well as creams or ointments in general for external applications.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:252,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. In such embodiments, the protein of SEQ ID NO:252, or a
fragment thereof, is administered to an individual in whom it is desired to increase or decrease any
of the activities of the protein of SEQ ID NO:252. The protein of SEQ ID NO:252 or fragment
thereof may be administered directly to the individual or, alternatively, a nucleic acid encoding the
protein of SEQ ID NO:252 or a fragment thereof may be administered to the individual.

Alternatively, an agent which increases the activity of the protein of SEQ ID NO:252 may be 
administered to the individual. Such agents may be identified by contacting the protein of SEQ ID 
NO:252 or a cell or preparation containing the protein of SEQ ID NO:252 with a test agent and 
assaying whether the test agent increases the activity of the protein. For example, the test agent
may be a chemical compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:252 may be decreased by 
administering an agent which interferes with such activity to an individual. Agents which interfere 
with the activity of the protein of SEQ ID NO:252 may be identified by contacting the protein of 
SEQ ID NO:252 or a cell or preparation containing the protein of SEQ ID NO:252 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an 
antisense nucleic acid or a triple helix-forming nucleic acid. 

In one embodiment, the invention relates to methods and compositions using the protein of 
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, saliva or tears, or to distinguish between two or more possible sources of a sample on the
basis of the level of the protein of SEQ ID NO:252 in the sample. For example, the protein of SEQ
ID NO:252 or fragments thereof may be used to generate antibodies using any techniques known to
those skilled in the art, including those described herein. Such antibodies may then be used to
identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue that
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a sample is contacted with the antibody, which
may be detectably labeled, under conditions which facilitate antibody binding. The level of
antibody binding to the test sample is measured and compared to the level of binding to control cells
from saliva or tears or tissues other than saliva or tears to determine whether the test sample is from
saliva or tears. Alternatively, the level of the protein of SEQ ID NO:252 in a test sample may be
measured by determining the level of RNA encoding the protein of SEQ ID NO:252 in the test 
sample. RNA levels may be measured using nucleic acid arrays or using techniques such as in situ 
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hybridization, Northern blots, dot blots or other technques familiar to those skilled in the art. If 
desired, an amplification reaction, such as a PCR reaction, may be performed on the nucleic acid 
sample prior to analysis. The level of RNA in the test sample is compared to RNA levels in control 
cells from saliva or tears or tissues other than saliva or tears to determine whether the test sample is 
from saliva or tears.

In another embodiment, antibodies to the protein of the invention or part thereof may be 
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:252, 
including using methods known to those skilled in the art. For example, an antibody against the 
protein of SEQ ID NO:252 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:252 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The 
support is washed and then the cells are released from the support by contacting the support with 
agents which cause the cells to dissociate from the antibody. 

In another embodiment of the present invention, the protein of SEQ ID NO:252 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:252. In such techniques, the level of the protein of SEQ ID NO:252 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:252 in the ill individual is compared to the level in normal individuals to
determine whether the individual has a level of the protein of SEQ ID NO:252 which is indicative
of disease.

Protein of SEQ ID NO:308 (internal designation 187-41-0-0-i21-CS)

The protein of SEQ ID NO:308 is encoded by the cDNA of SEQ ID NO:67. Accordingly, 
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:308 
described throughout the present application also pertain to the polypeptide encoded by the human 
cDNA of clone 187-41-0-0-i21-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:67 described throughout the present application also pertain
to the human cDNA of clone 187-41-0-0-i21-CS. 

The protein of SEQ ID NO:308 is highly homologous to human secreted protein nf87_1
from PCT publication WO 9935252-A2 (the disclosure of which is incorporated herein by reference
in its entirety), to amino acids 26-129 of the human secreted protein SEQ ID NO:441 from PCT
publication WO 9906548-A2 (the disclosure of which is incorporated herein by reference in its
entirety), and to amino acids 26-114 of human secreted protein SEQ ID NO:439 from PCT
publication WO 9906548-A2, the disclosure of which is incorporated herein by reference in its
entirety. Thus, the protein of the invention appears to be a polymorphic variant of nf87_1. Since
most of the proteins with high homology to the sequence of the invention have longer 5' termini, it
is conceivable that the protein of the invention is a truncated/spliced variant of these proteins.

The protein of SEQ ID NO:308 was identified among the cDNAs from a library constructed
from brain. Tissue distribution analysis through a BLAST analysis of databases shows that mRNA
encoding this protein was found primarily in kidney, liver, and cancerous prostate.

The protein of SEQ ID NO:308 has chemical and structural homology to human
interferon-inducible (IFI) protein isoforms p27 (63%), HIFI (50% identity), and to interferon-induced
protein 6-16 precursor (IFI-6-16, 36%). Furthermore, the protein of the invention has structural
homology (40% identity) to the human erythropoietin (EPO) primary response gene, EPRG3pt from PCT
publication WO 9906063-A2, the disclosure of which is incorporated herein by reference in its
entirety. Thus, the present invention relates to nucleic acid and amino acid sequences of a novel IFI
protein and to the use of these sequences in the diagnosis, study, prevention and treatment of
disease.
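
The identity percentages quoted above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-") and that identity is counted over aligned, ungapped positions only; the specification does not state which alignment program or denominator was used, so both choices, and the function name percent_identity, are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: positions where either sequence has a gap ('-') are
    excluded from the denominator. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # aligned positions with a residue in both sequences
    matches = 0   # positions where the two residues are identical
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not a sequence from the specification.
    print(percent_identity("MKV-LA", "MRVALA"))
```

Because the result depends on the alignment and on how gaps are treated, figures computed this way are only comparable when the same convention is applied to every pair.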

The protein of SEQ ID NO:308 comprises 105 amino acids. From the amino acid 
alignments and the hydrophobicity plots, it has a predicted signal peptide sequence spanning 
residues 31-43 and two predicted transmembrane domains spanning residues 17-37 and 48-68.
Accordingly, one embodiment of the present invention is a polypeptide comprising the signal
peptide and/or one or more of the transmembrane domains.
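
A hydrophobicity plot of the kind referred to above can be sketched as a sliding-window average of per-residue hydropathy values. The example below assumes the Kyte-Doolittle scale, a 19-residue window and a cutoff of 1.6; the specification does not say which scale, window or threshold was used, so these parameters and the function names are illustrative assumptions only.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the text).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_segments(sequence, window=19, threshold=1.6):
    """Return 1-based (start, end) residue ranges covered by consecutive
    windows whose mean hydropathy is at or above the threshold."""
    profile = hydropathy_profile(sequence, window)
    segments = []
    run_start = None  # index of the first window in the current run
    for i, value in enumerate(profile):
        if value >= threshold and run_start is None:
            run_start = i
        elif value < threshold and run_start is not None:
            # Windows run_start..i-1 cover residues run_start+1..i-1+window.
            segments.append((run_start + 1, i - 1 + window))
            run_start = None
    if run_start is not None:
        segments.append((run_start + 1, len(profile) - 1 + window))
    return segments


if __name__ == "__main__":
    # Hypothetical toy sequence: a leucine stretch flanked by aspartates.
    toy = "D" * 10 + "L" * 20 + "D" * 10
    print(candidate_tm_segments(toy))
```

The reported ranges are the union of all residues touched by qualifying windows, so they are wider than the hydrophobic core itself; a real prediction would normally refine the boundaries and combine the plot with sequence alignments, as the text indicates.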

Interferons (IFNs) are a part of the group of intercellular messenger proteins known as
cytokines. α-IFN is the product of a multigene family of at least 16 members, whereas β-IFN is the
product of a single gene. α- and β-IFNs are also known as type I IFNs. Type I IFNs are produced in
a variety of cell types. Biosynthesis of type I IFNs is stimulated by viruses and other pathogens,
and by various cytokines and growth factors. γ-IFN, also known as type II IFN, is produced in
T-cells and natural killer cells. Antigens to which the organism has been sensitized stimulate
biosynthesis of type II IFN. Both α- and γ-IFNs are immunomodulators and anti-inflammatory
agents, activating macrophages, T-cells and natural killer cells.

IFNs are part of the body's natural defense to viruses and tumors. They exert these defenses
by affecting the function of the immune system and by direct action on pathogens and tumor cells.
IFNs mediate these multiple effects in part by inducing the synthesis of many cellular proteins.
Some interferon-inducible (IFI) genes are induced equally well by α-, β- and γ-IFNs. Other IFI
genes are preferentially induced by the type I or by the type II IFNs. The various proteins produced
by IFI genes possess antitumor, antiviral and immunomodulatory functions. The expression of
tumor antigens in cancer cells is increased by α-IFN, which renders the cancer cells more susceptible
to immune rejection. The IFI proteins synthesized in response to viral infections are known to
inhibit viral functions such as cell penetration, uncoating, RNA and protein synthesis, assembly and
release (Hardman JG et al 25 (1996) The Pharmacological Basis of Therapeutics, McGraw-Hill,
New York NY pp 1211-1214, the disclosure of which is incorporated herein by reference in its
entirety). Type II IFN stimulates expression of major histocompatibility complex (MHC) proteins
and is thus used in immune response enhancement.

The IFI gene known as 6-16 encodes an mRNA which is highly induced by type I IFNs in a
variety of human cells (Kelly JM et al (1986) EMBO J 5:1601-1606, the disclosure of which is
incorporated herein by reference in its entirety). After induction, 6-16 mRNA constitutes as much
as 0.1% of the total cellular mRNA. The 6-16 mRNA is present at only very low levels in the
absence of type I IFN, and is only weakly induced by type II IFN. The 6-16 mRNA encodes a
hydrophobic protein of 130 amino acids. The first 20 to 23 amino acids comprise a putative signal
peptide. Protein 6-16 has at least two predicted transmembrane regions culminating in a negatively
charged C-terminus.

The p27 gene encodes a protein with 41% amino acid sequence identity to the 6-16 protein.
The p27 gene is expressed in some breast tumor cell lines and in a gastric cancer cell line. In other
breast tumor cell lines, in the HeLa cervical cancer cell line, and in fetal lung fibroblasts, p27
expression occurs only upon α-IFN induction. In one breast tumor cell line, p27 is independently
induced by estradiol and by IFN (Rasmussen UB et al (1993) Cancer Res 53:4096-4101, the
disclosure of which is incorporated herein by reference in its entirety). Expression of p27 was
analyzed in 21 primary invasive breast carcinomas, 1 breast cancer bone metastasis, and 3 breast
fibroadenomas. High levels of p27 were found in about one-half of the primary carcinomas and in
the bone metastasis, but not in the fibroadenomas. These observations suggest that certain breast
tumors may produce high levels of, or have increased sensitivity to, type I IFN as compared to other
breast tumors (Rasmussen UB et al, supra). In addition, the p27 gene is expressed at significant levels
in normal tissues including colon, stomach and lung, but is not expressed in placenta, kidney, liver or
skin (Rasmussen UB et al, supra).

The small hydrophobic IFI gene products may contribute to viral resistance. A hepatitis-C
virus (HCV)-induced gene, 130-51, was isolated from a cDNA library prepared from chimpanzee
liver during the acute phase of the infection. The protein product of this gene has 97% identity to
the human 6-16 protein (Kato T et al (1992) Virology 190:856-860, the disclosure of which is
incorporated herein by reference in its entirety). The authors of the preceding paper suggest that
HCV infection actively induces IFN expression, which in turn induces expression of IFI genes
including 130-51. The IFI proteins synthesized in response to viral infections are known to inhibit
viral functions such as penetration, uncoating, RNA or protein synthesis, assembly or release. The
130-51 protein may inhibit one or more of these functions in HCV. A particular virus may be
inhibited in multiple functions by IFI proteins. In addition, the principal inhibitory effect exerted by
IFI proteins differs among the virus families (Hardman JG, supra, p 1211, the disclosure of which is
incorporated herein by reference).

The HIFI protein (PCT publication WO 9812223-A2, the disclosure of which is
incorporated herein by reference in its entirety) is a human sequence identified among cDNAs from
a library constructed from human neonatal kidney. Northern blot analysis using the LIFESEQ™
database (Incyte Pharmaceuticals, Palo Alto, CA) shows that HIFI mRNA was found only in
neonatal kidney. The HIFI protein consists of 104 amino acids and has 55%, 45%, and 46% amino
acid sequence identity to p27, 6-16 and 130-51, respectively.

Based on the chemical and structural homology between the protein of SEQ ID NO:308 and
the small hydrophobic IFI proteins from human and chimpanzee, it is believed that the protein of
SEQ ID NO:308 is synthesized when interferons are produced in infections, inflammation,
autoimmune diseases, etc. Interferons are produced in response to various cytokines and growth
factors, in viral infections, inflammation, autoimmune diseases, and cancers. Accordingly, the
protein of SEQ ID NO:308 or fragments thereof may be used in diagnosis and treatment of diseases
such as, but not limited to, autoimmune disorders such as rheumatoid arthritis, Graves' disease,
systemic lupus erythematosus, autoimmune hepatitis, Wegener's granulomatosis, sarcoidosis,
polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's syndrome, inflammatory
bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis, Type I diabetes,
insulin-dependent diabetes mellitus, lupus nephritis, and allergic encephalomyelitis; proliferative
disorders including various forms of cancer such as leukemias, lymphomas (Hodgkin's and
non-Hodgkin's), sarcomas, melanomas, adenomas, carcinomas of solid tissue, hypoxic tumors,
squamous cell carcinomas of the mouth, throat, larynx, and lung, genitourinary cancers such as
cervical and bladder cancer, hematopoietic cancers, head and neck cancers, and nervous system
cancers, benign lesions such as papillomas, atherosclerosis, angiogenesis; viral infections, in
particular HCV and HIV infections, as well as other pathogen-induced infections (e.g. leishmania).

The protein of SEQ ID NO:308 or fragments thereof may also be used to treat conditions
associated with inflammation or immune impairment (e.g. rheumatoid arthritis, osteoarthritis and
AIDS).

Another embodiment of the present invention relates to the use of the protein of SEQ ID
NO:308 or fragments thereof to treat and/or prevent the ill effects of bacterial infection during
pregnancy in mammals, such as spontaneous abortion and maternal death. In a preferred
embodiment, the protein of the invention may be used to counteract the effects of the bacterial
endotoxin lipopolysaccharide (LPS). The methods for using such compositions are described in
Dziegielewska and Andersen, Biol. Neonate, 74:372-5 (1998), the disclosure of which is
incorporated herein by reference in its entirety.

Furthermore, the protein of SEQ ID NO:308 or fragments thereof are useful as reagents for
analyzing the control of gene expression by interferons and other cytokines in both normal and
diseased cells. The protein of SEQ ID NO:308 or fragments thereof may be used to identify
specific molecules to which it binds, such as agonists, antagonists or inhibitors.

Another embodiment of the present invention relates to methods of using the protein of
SEQ ID NO:308 or fragments thereof to identify and/or quantify cytokines of the interferon family
as well as other cytokines such as IL-10 and tumor antigens, which may interact with the protein of
the invention.

The protein of SEQ ID NO:308 or fragments thereof may also be included in
pharmaceutical preparations for treating cancer or prevention/treatment of other diseases associated
with changes in expression of the protein of the invention. In another embodiment of the present
invention, the protein of SEQ ID NO:308 or fragments thereof is used to inhibit and/or modulate the
effect of cytokines and related molecules such as IL-2, TNF alpha, CTLA4, CD28, and others, by
preventing the binding of the endogenous cytokines to their natural receptors, thereby blocking cell
proliferation or inhibitory signals generated by the ligand-receptor binding event.

The protein of SEQ ID NO:308 or fragments thereof is useful to correct defects in in vivo
models of disease such as autoimmune, inflammation and tumor models, by injecting the protein
either intraperitoneally, intravenously, subcutaneously or directly in the diseased tissue.

The DNA encoding the protein of SEQ ID NO:308 or fragments thereof is useful in
diagnostic assays for conditions/diseases associated with expression of the protein of the invention.
The diagnostic assay is useful to distinguish between absence, presence, and excess expression of
the protein of the invention and to monitor regulation of levels of the protein of the invention during
therapeutic intervention. The DNA may also be incorporated into effective eukaryotic expression
vectors and directly targeted to a specific tissue, organ, or cell population for use in gene therapy to
treat the above mentioned conditions, including tumors and/or to correct disease- or genetic-induced
defects in any of the above mentioned proteins including the protein of the invention. The DNA
may also be used to design antisense sequences and ribozymes, which can be administered to
modify gene expression in tumor and pathogen-infected cells and to influence expression of
cytokines and growth factors. In vivo delivery of genetic constructs into subjects can be developed
to the point of targeting specific cell types, such as tumor cells, in which expression of the protein
of the invention may be affected or may be modulating the expression and/or activity of other
proteins such as cytokines, growth factors, their receptors and/or tumor antigens. The DNA is also
useful to detect unknown upstream sequences (e.g. promoters and regulatory elements) by standard
techniques and for research into the control of gene expression by interferons and other cytokines,
as well as growth and transcription factors in normal and diseased cells. Hybridization probes are
useful to detect DNA encoding the protein of the invention (or closely related molecules) in
biological samples, and for mapping the naturally occurring genomic sequence to a particular
chromosome/chromosome region. The DNA may be used to generate and/or treat in vivo animal models
of disease, including susceptibility or resistance to infection, inflammation, tumors and autoimmune
conditions, as well as tumor therapy, based on vaccine, knock-out and transgene technologies.

Antibodies against the protein of SEQ ID NO:308 or fragments thereof are useful for the
diagnosis of conditions and diseases associated with its expression and to quantify the protein of the
invention (e.g. in assays to monitor patients during therapeutic intervention). Antibodies specific
for the protein may include, but are not limited to, polyclonal, monoclonal, chimeric, single chain,
and Fab fragments produced by a Fab expression library. Neutralizing antibodies are especially
preferred for diagnostics and therapeutics. Diagnostic assays for the protein of the invention include
methods utilizing the antibody and a label to detect the protein of the invention in human body
fluids or extracts of cells or tissues.

The protein of the invention and its catalytic or immunogenic fragments or oligopeptides
thereof can be used for screening therapeutic compounds in any of a variety of drug screening
techniques, including high throughput screening. Methods which may be used to quantitate the
expression of the nucleotide or protein of the invention include, but are not limited to, polymerase
chain reaction (PCR), RT-PCR, RNAse protection, Northern and western blotting, enzyme-linked
immunosorbent assay (ELISA), radioimmunoassay (RIA), fluorescent activated cell sorting (FACS),
immunoprecipitation, and chromatography.

Under conditions of significant blood loss, EPO therapy, or both, iron-restricted 
erythropoiesis is evident. However, intravenous or oral iron therapy has substantial drawbacks. 
Moreover, traditional biochemical markers of storage iron in patients with anemia of chronic 
disease are unhelpful in the assessment of iron status (Lawrence T et al (2000) Blood 96:823-833,
the disclosure of which is incorporated herein by reference in its entirety). As the protein of SEQ
ID NO:308 bears homology to the human erythropoietin (EPO) primary response gene, EPRG3pt, it 
may be used to promote red blood cell formation or to monitor the value of safer intravenous iron 
preparations in patients with blood loss anemia, particularly those undergoing EPO therapy. 

The hydrophobic IFI protein of SEQ ID NO:308 or fragments thereof may be used to
diagnose conditions associated with its induction. For example, the protein of SEQ ID NO:308 or
fragments thereof may be useful in the diagnosis and treatment of tumors, viral infections,
inflammation, or conditions associated with impaired immunity, anemia of chronic blood loss or
chronic disease, hemochromatosis, and EPO therapy. Furthermore, this protein may be used for
investigating the control of gene expression by IFNs and other cytokines, as well as hormones and
growth factors, in normal and diseased cells.

The protein of SEQ ID NO:308 or fragments thereof is useful to correct defects in in vivo 
models of disease such as autoimmune, inflammation, anemia, iron-overload and tumor models, by 
injecting the protein either intraperitoneally, intravenously, subcutaneously or directly in the
diseased tissue. 

In addition, the protein of SEQ ID NO:308 is structurally related to other proteins having
homology and/or structural similarity with human p27 (Rasmussen, U.B., et al., 1993, Cancer
Research 53:4096-4101, the disclosure of which is incorporated herein by reference). Accordingly,
the protein of brain, fetal brain, kidney, fetal kidney, or colon may be used to regulate the
proliferation of EPO-dependent cells or the growth and development of erythroid and other
hematopoietic lineages.

The protein of SEQ ID NO:308 or fragments thereof, or polynucleotides encoding the 
protein of SEQ ID NO:308 or fragments thereof, may be used to treat or ameliorate anemia of
chronic disease and chronic renal failure, polycythemia, cancer, AIDS, drug- and
phlebotomy-induced anemias, hemochromatosis, erythropoiesis mediated by EPO therapy, and other conditions
associated with altered activity or levels of the protein of SEQ ID NO:308. 

In another embodiment, the present invention relates to methods for identifying agonists
and antagonists/inhibitors using the protein of SEQ ID NO:308 or fragments thereof, and treating
conditions with the identified compounds. In a still further aspect, the invention relates to
diagnostic assays for detecting diseases associated with inappropriate levels or activity of the
protein of SEQ ID NO:308. Still another embodiment of the invention relates to the use of the
protein of SEQ ID NO:308, fragments thereof, or the DNA encoding the protein of SEQ ID NO:308 or
fragments thereof to monitor the value of iron therapy in patients undergoing EPO therapy, or
experiencing blood loss, or both.

The DNA encoding the protein of SEQ ID NO:308 or fragments thereof is useful in
diagnostic assays for conditions/diseases associated with abnormal expression of the protein of SEQ
ID NO:308. The diagnostic assay is useful to distinguish between absence, presence, and excess
expression of the protein of the invention and to monitor regulation of levels of the protein of the
invention during therapeutic intervention. The DNA may also be incorporated into effective
eukaryotic expression vectors and directly targeted to a specific tissue, organ, or cell population for
use in gene therapy to treat the above mentioned conditions, including tumors and/or to correct
disease- or genetic-induced defects in any of the above mentioned proteins including the protein of
the invention. The DNA may also be used to design antisense sequences and ribozymes, which can
be administered to modify gene expression in tumor and pathogen-infected cells and to influence
expression of cytokines, hormones and growth factors. In vivo delivery of genetic constructs into
subjects can be developed to the point of targeting specific cell types, such as tumor cells, in which
expression of the protein of the invention may be affected or may be modulating the expression and/or
activity of other proteins such as cytokines, growth factors, their receptors and/or tumor antigens.
The DNA is also useful to detect unknown upstream sequences (e.g. promoters and regulatory elements)
by standard techniques and for research into the control of gene expression by interferons and other
cytokines, as well as growth and transcription factors in normal and diseased cells. Hybridization
probes are useful to detect DNA encoding the protein of the invention (or closely related molecules)
in biological samples, and for mapping the naturally occurring genomic sequence to a particular
chromosome/chromosome region. The DNA may be used to generate and/or treat in vivo animal
models of disease, including susceptibility or resistance to infection, tumors, autoimmune
conditions, anemia and iron-overload, as well as tumor therapy, based on vaccine, knock-out and
transgene technologies.

Antibodies against the protein of SEQ ID NO:308 are useful for the diagnosis of conditions
and diseases associated with its expression and to quantify the protein of the invention (e.g. in
assays to monitor patients during therapeutic intervention). Antibodies specific for the protein may
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, and Fab fragments
produced by a Fab expression library. Neutralizing antibodies are especially preferred for
diagnostics and therapeutics. Diagnostic assays for the protein of SEQ ID NO:308 include methods
utilizing the antibody and a label to detect the protein of the invention in human body fluids or
extracts of cells or tissues.

The protein of SEQ ID NO:308 and its catalytic or immunogenic fragments or oligopeptides
thereof can be used for screening therapeutic compounds in any of a variety of drug screening
techniques, including high throughput screening. Methods which may be used to quantitate the
expression of the nucleotide or protein of the invention include, but are not limited to, polymerase
chain reaction (PCR), RT-PCR, RNAse protection, Northern blotting, enzyme-linked immunosorbent
assay (ELISA), radioimmunoassay (RIA), fluorescent activated cell sorting (FACS),
immunoprecipitation, and chromatography.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:308,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. For example, the condition may be cancer, including breast
cancer, viral infection, bacterial infection, inflammation, autoimmune disorders, rheumatoid
arthritis, Graves' disease, systemic lupus erythematosus, autoimmune hepatitis, Wegener's
granulomatosis, sarcoidosis, polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's
syndrome, inflammatory bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis,
Type I diabetes, insulin-dependent diabetes mellitus, lupus nephritis, and allergic
encephalomyelitis; proliferative disorders including various forms of cancer such as leukemias,
lymphomas (Hodgkin's and non-Hodgkin's), sarcomas, melanomas, adenomas, carcinomas of solid
tissue, hypoxic tumors, squamous cell carcinomas of the mouth, throat, larynx, and lung,
genitourinary cancers such as cervical and bladder cancer, hematopoietic cancers, head and neck
cancers, and nervous system cancers, benign lesions such as papillomas, atherosclerosis,
angiogenesis; viral infections, in particular HCV and HIV infections, as well as other
pathogen-induced infections (e.g. leishmania).
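
The fragments of consecutive amino acids recited above can be enumerated with a simple sliding window. The following is a minimal sketch only; the function name consecutive_fragments and the toy sequence are hypothetical and are not taken from the sequence listing.

```python
def consecutive_fragments(sequence: str, length: int) -> list:
    """All fragments of exactly `length` consecutive residues.

    Returns an empty list when the sequence is shorter than `length`.
    """
    if length < 1:
        raise ValueError("fragment length must be at least 1")
    return [
        sequence[start:start + length]
        for start in range(len(sequence) - length + 1)
    ]


if __name__ == "__main__":
    # Hypothetical 7-residue toy peptide.
    for fragment in consecutive_fragments("MKVLAAG", 5):
        print(fragment)
```

A sequence of n residues yields n - length + 1 fragments of a given length, so for a protein of 105 amino acids, as described above for SEQ ID NO:308, the recited lengths of 150 and 200 yield no fragments.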

In such embodiments, the protein of SEQ ID NO:308, or a fragment thereof, is
administered to an individual in whom it is desired to increase or decrease any of the activities of
the protein of SEQ ID NO:308. The protein of SEQ ID NO:308 or fragment thereof may be
administered directly to the individual or, alternatively, a nucleic acid encoding the protein of SEQ
ID NO:308 or a fragment thereof may be administered to the individual. Alternatively, an agent
which increases the activity of the protein of SEQ ID NO:308 may be administered to the
individual. Such agents may be identified by contacting the protein of SEQ ID NO:308 or a cell or
preparation containing the protein of SEQ ID NO:308 with a test agent and assaying whether the
test agent increases the activity of the protein. For example, the test agent may be a chemical
compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:308 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:308 may be identified by contacting the protein of
SEQ ID NO:308 or a cell or preparation containing the protein of SEQ ID NO:308 with a test
agent and assaying whether the test agent decreases the activity of the protein. For example, the
agent may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as
an antisense nucleic acid or a triple helix-forming nucleic acid.
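
The decision step of such a screen, namely whether a test agent increases or decreases activity, can be sketched as a comparison against an untreated control. The sketch below assumes a single numeric activity readout and an arbitrary 1.5-fold cutoff; neither the readout nor the cutoff is specified in the text, and the function name classify_agent is hypothetical.

```python
def classify_agent(activity_with_agent: float,
                   activity_without_agent: float,
                   min_fold_change: float = 1.5) -> str:
    """Classify a test agent by the ratio of treated to untreated activity.

    Assumption: activity is a positive number from some assay, and a
    change smaller than `min_fold_change` is treated as no clear effect.
    """
    if activity_without_agent <= 0:
        raise ValueError("control activity must be positive")
    if min_fold_change <= 1:
        raise ValueError("min_fold_change must be greater than 1")

    ratio = activity_with_agent / activity_without_agent
    if ratio >= min_fold_change:
        return "increases activity"
    if ratio <= 1 / min_fold_change:
        return "decreases activity"
    return "no clear effect"


if __name__ == "__main__":
    # Hypothetical readouts in arbitrary units.
    print(classify_agent(30.0, 10.0))
    print(classify_agent(4.0, 10.0))
```

In practice replicate measurements and a statistical test would replace the fixed cutoff; the fixed ratio is used here only to keep the illustration short.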

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, brain, kidney, liver, or cancerous prostate, or to distinguish between two or more possible
sources of a sample on the basis of the level of the protein of SEQ ID NO:308 in the sample. For
example, the protein of SEQ ID NO:308 or fragments thereof may be used to generate antibodies
using any techniques known to those skilled in the art, including those described therein. Such
antibodies may then be used to identify tissues of unknown origin, for example, forensic samples,
differentiated tumor tissue that has metastasized to foreign bodily sites, or to differentiate different
tissue types in a tissue cross-section using immunochemistry. In such methods a sample is
contacted with the antibody, which may be detectably labeled, under conditions which facilitate
antibody binding. The level of antibody binding to the test sample is measured and compared to the
level of binding to control cells from brain, kidney, liver, or cancerous prostate or tissues other than
brain, kidney, liver, or cancerous prostate to determine whether the test sample is from brain,
kidney, liver, or cancerous prostate. Alternatively, the level of the protein of SEQ ID NO:308 in a
test sample may be measured by determining the level of RNA encoding the protein of SEQ ID
NO:308 in the test sample. RNA levels may be measured using nucleic acid arrays or using
techniques such as in situ hybridization, Northern blots, dot blots or other techniques familiar to
those skilled in the art. If desired, an amplification reaction, such as a PCR reaction, may be
performed on the nucleic acid sample prior to analysis. The level of RNA in the test sample is
compared to RNA levels in control cells from brain, kidney, liver, or cancerous prostate or tissues
other than brain, kidney, liver, or cancerous prostate to determine whether the test sample is from
brain, kidney, liver, or cancerous prostate.
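
The comparison of a test sample against control levels described above can be illustrated by assigning the sample to the control group whose mean level is closest. This is only one possible decision rule, chosen here for brevity; the text does not prescribe one, and the function name closest_source and all numeric values are hypothetical.

```python
from statistics import mean


def closest_source(sample_level: float, control_levels: dict) -> str:
    """Return the control source whose mean level is nearest the sample.

    `control_levels` maps a source name to a list of measured levels
    (antibody binding or RNA level, in the same units as the sample).
    """
    if not control_levels:
        raise ValueError("at least one control source is required")
    control_means = {
        source: mean(levels) for source, levels in control_levels.items()
    }
    return min(
        control_means,
        key=lambda source: abs(control_means[source] - sample_level),
    )


if __name__ == "__main__":
    # Hypothetical control measurements in arbitrary units.
    controls = {
        "kidney": [8.0, 9.0, 10.0],
        "other tissue": [1.0, 2.0, 3.0],
    }
    print(closest_source(8.5, controls))
```

A nearest-mean rule ignores the spread within each control group, so overlapping distributions would call for a rule that also accounts for variance.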

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:308,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:308 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:308 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:308 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:308. In such techniques, the level of the protein of SEQ ID NO:308 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:308 in the ill individual is compared to the level in normal individuals to
determine whether the individual has a level of the protein of SEQ ID NO:308 which is associated
with disease.
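
The comparison between an ill individual and normal individuals can be sketched as a standard score against a reference group. The example assumes that levels in normal individuals are roughly normally distributed and uses a cutoff of two standard deviations; both are illustrative assumptions rather than anything stated in the text, and the function name flag_altered_level is hypothetical.

```python
from statistics import mean, stdev


def flag_altered_level(patient_level: float,
                       normal_levels: list,
                       z_cutoff: float = 2.0) -> str:
    """Compare one measured level with levels from normal individuals."""
    if len(normal_levels) < 2:
        raise ValueError("at least two normal measurements are required")

    reference_mean = mean(normal_levels)
    reference_sd = stdev(normal_levels)  # sample standard deviation
    if reference_sd == 0:
        raise ValueError("normal levels show no variation")

    z_score = (patient_level - reference_mean) / reference_sd
    if z_score > z_cutoff:
        return "elevated"
    if z_score < -z_cutoff:
        return "reduced"
    return "within reference range"


if __name__ == "__main__":
    # Hypothetical levels in arbitrary units.
    normal = [10.0, 11.0, 9.0, 10.0, 10.0]
    print(flag_altered_level(13.0, normal))
```

Whether an altered level is actually associated with disease would still have to be established clinically; the score only indicates how far a measurement lies from the reference group.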

Protein of SEQ ID NOs:289 and 307 (internal designations 175-1-3-0-E5-CS.cor and 187-39-0-0-k12-CS)

The protein of SEQ ID NO:289 is encoded by the cDNA of SEQ ID NO:48. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:289
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 175-1-3-0-E5-CS. In addition, it will be appreciated that all characteristics and uses
of the nucleic acid of SEQ ID NO:48 described throughout the present application also pertain to
the human cDNA of clone 175-1-3-0-E5-CS.

The protein of the invention consists of 130 amino acids. From the amino acid alignments
and the hydrophobicity plots, it has a predicted signal peptide sequence spanning residues 8-20 and
four predicted transmembrane domains spanning residues 2-24, 42-61, 70-90 and 99-119.
Accordingly, some embodiments of the present invention relate to polypeptides comprising the
signal peptide and/or one or more of the transmembrane domains.
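
The residue ranges given above may be used to excise the corresponding regions from the
130-residue sequence, as in the following minimal sketch. The sequence shown is a hypothetical
placeholder (the actual sequence is that of SEQ ID NO:289 in the sequence listing), and the
function names are assumptions made for the example. Residue positions are treated as
1-based and inclusive.

```python
# Hypothetical 130-residue placeholder; substitute the sequence of SEQ ID NO:289.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
PLACEHOLDER_SEQ = "".join(ALPHABET[i % 20] for i in range(130))

# Residue ranges quoted in the text (1-based, inclusive).
SIGNAL_PEPTIDE = (8, 20)
TRANSMEMBRANE_DOMAINS = [(2, 24), (42, 61), (70, 90), (99, 119)]


def extract_region(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range lies outside the sequence")
    return sequence[start - 1:end]


def region_lengths(ranges):
    """Return the number of residues covered by each (start, end) range."""
    return [end - start + 1 for start, end in ranges]


signal = extract_region(PLACEHOLDER_SEQ, *SIGNAL_PEPTIDE)
domains = [extract_region(PLACEHOLDER_SEQ, s, e) for s, e in TRANSMEMBRANE_DOMAINS]
print(len(signal), region_lengths(TRANSMEMBRANE_DOMAINS))
```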

The protein of SEQ ID NO:289 encoded by the cDNA of SEQ ID NO:48 is homologous to
SEQ ID NO:4199 from EP 1033401-A2 (the disclosure of which is incorporated herein by
reference in its entirety), a human secreted protein. Another protein, SEQ ID NO:307, encoded by
the cDNA of SEQ ID NO:66, is a polymorphic variant of the protein of SEQ ID NO:289, and shares
all of the herein-described functions and uses.

The present invention relates to a novel protein identified among the cDNAs from a library
constructed from salivary gland, and to the use of the nucleic acid and amino acid sequences
disclosed herein in the study, diagnosis, prevention, and treatment of disease. Tissue distribution
analysis predicted by BLAST on databases shows that mRNA encoding this protein was found
primarily in brain and fetal brain, with lower amounts in kidney, fetal kidney and colon.

Interferons (IFNs) are a part of the group of intercellular messenger proteins known as
cytokines. α-IFN is the product of a multigene family of at least 16 members, whereas β-IFN is the
product of a single gene. α- and β-IFNs are also known as type I IFNs. Type I IFNs are produced in
a variety of cell types. Biosynthesis of type I IFNs is stimulated by viruses and other pathogens,
and by various cytokines and growth factors. γ-IFN, also known as type II IFN, is produced in T-
cells and natural killer cells. Antigens to which the organism has been sensitized stimulate
biosynthesis of type II IFN. Both α- and γ-IFNs are immunomodulators and anti-inflammatory
agents, activating macrophages, T-cells and natural killer cells.
IFNs are part of the body's natural defense against viruses and tumors. They exert these defenses
by affecting the function of the immune system and by direct action on pathogens and tumor cells.
IFNs mediate these multiple effects in part by inducing the synthesis of many cellular proteins.
Some interferon-inducible (IFI) genes are induced equally well by α-, β- and γ-IFNs. Other IFI
genes are preferentially induced by the type I or by the type II IFNs. The various proteins produced
by IFI genes possess antitumor, antiviral and immunomodulatory functions. The expression of
tumor antigens in cancer cells is increased by α-IFN, which renders the cancer cells more susceptible
to immune rejection. The IFI proteins synthesized in response to viral infections are known to
inhibit viral functions such as cell penetration, uncoating, RNA and protein synthesis, assembly and
release (Hardman JG et al 25 (1996) The Pharmacological Basis of Therapeutics, McGraw-Hill,
New York NY pp 1211-1214, the disclosure of which is incorporated herein by reference in its
entirety). Type II IFN stimulates expression of major histocompatibility complex (MHC) proteins
and is thus used in immune response enhancement.

The protein of SEQ ID NO:289 is a small hydrophobic protein having chemical and
structural homology to human interferon-inducible (IFI) protein isoforms 6-16 (97% identity), HIFI
(44%), and p27 (33%), as well as 130-51, the chimpanzee homolog of 6-16 (97%). Thus, the
protein of SEQ ID NO:289 and the nucleic acid encoding it are polymorphic variants of 6-16 or the
gene encoding 6-16. The protein of SEQ ID NO:289, fragments thereof, or nucleic acids encoding
the protein of SEQ ID NO:289 or fragments thereof may be used in the diagnosis, study, prevention
and treatment of disease as described below.

The IFI gene known as 6-16 encodes an mRNA which is highly induced by type I IFNs in a
variety of human cells (Kelly JM et al (1986) EMBO J 5:1601-1606, the disclosure of which is
incorporated herein by reference in its entirety). After induction, 6-16 mRNA constitutes as much as
0.1% of the total cellular mRNA. The 6-16 mRNA is present at only very low levels in the absence
of type I IFN, and is only weakly induced by type II IFN. The 6-16 mRNA encodes a hydrophobic
protein of 130 amino acids. The first 20 to 23 amino acids comprise a putative signal peptide.
Protein 6-16 has at least two predicted transmembrane regions culminating in a negatively charged
C-terminus.

The p27 gene encodes a protein with 41% amino acid sequence identity to the 6-16 protein.
The p27 gene is expressed in some breast tumor cell lines and in a gastric cancer cell line. In other
breast tumor cell lines, in the HeLa cervical cancer cell line, and in fetal lung fibroblasts, p27
expression occurs only upon α-IFN induction. In one breast tumor cell line, p27 is independently
induced by estradiol and by IFN (Rasmussen UB et al (1993) Cancer Res 53:4096-4101, the
disclosure of which is incorporated herein by reference in its entirety). Expression of p27 was
analyzed in 21 primary invasive breast carcinomas, 1 breast cancer bone metastasis, and 3 breast
fibroadenomas. High levels of p27 were found in about one-half of the primary carcinomas and in
the bone metastasis, but not in the fibroadenomas. These observations suggest that certain breast
tumors may produce high levels of, or have increased sensitivity to, type I IFN as compared to other
breast tumors (Rasmussen UB et al, supra). In addition, the p27 gene is expressed at significant levels
in normal tissues including colon, stomach and lung, but is not expressed in placenta, kidney, liver or
skin (Rasmussen UB et al, supra).

The small hydrophobic IFI gene products may contribute to viral resistance. A hepatitis-C
virus (HCV)-induced gene, 130-51, was isolated from a cDNA library prepared from chimpanzee
liver during the acute phase of the infection. The protein product of this gene has 97% identity to
the human 6-16 protein (Kato T et al (1992) Virology 190:856-860, the disclosure of which is
incorporated herein by reference in its entirety). The authors of this paper suggest that HCV
infection actively induces IFN expression, which in turn induces expression of IFI genes including
130-51. The IFI proteins synthesized in response to viral infections are known to inhibit viral
functions such as penetration, uncoating, RNA or protein synthesis, assembly or release. The 130-51
protein may inhibit one or more of these functions in HCV. A particular virus may be inhibited
in multiple functions by IFI proteins. In addition, the principal inhibitory effect exerted by IFI
proteins differs among the virus families (Hardman JG, supra, p 1211, the disclosure of which is
incorporated herein by reference).

The HIFI protein (PCT publication WO 9812223-A2, the disclosure of which is
incorporated herein by reference in its entirety) is a human sequence identified among cDNAs from
a library constructed from human neonatal kidney. Northern blot analysis using the LIFESEQ™
database (Incyte Pharmaceuticals, Palo Alto, CA) shows that HIFI mRNA was found only in
neonatal kidney. The HIFI protein consists of 104 amino acids and has 55%, 45%, and 46% amino
acid sequence identity to p27, 6-16 and 130-51, respectively.
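
Percent amino acid sequence identity figures of the kind quoted above are computed from a
pairwise alignment. The following is a minimal sketch of one common convention, in which
identical residues are counted over the aligned positions that contain no gap in either
sequence. The function name and the short aligned strings are hypothetical, and other
conventions (for example, dividing by the length of the shorter sequence) give different
values; the convention used to obtain the figures in the present application is not stated.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns that contain no gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
print(percent_identity("MKV-LA", "MKVTLG"))
```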

The hydrophobic IFI proteins of the invention may provide the basis for clinical diagnosis
of diseases associated with their induction. These proteins may be useful in the diagnosis and
treatment of tumors, viral infections, inflammation, or conditions associated with impaired
immunity. Furthermore, these proteins may be used for investigations of the control of gene
expression by IFNs and other cytokines in normal and diseased cells.

Based on the chemical and structural homology among the protein of SEQ ID NO:289 and
the small hydrophobic IFI proteins from human and chimpanzee, it is believed that the protein of
SEQ ID NO:289 is synthesized when interferons are produced in infections, inflammation,
autoimmune diseases, etc. Interferons are produced in response to various cytokines and growth
factors, in viral infections, inflammation, autoimmune diseases, and cancers. Accordingly, the
protein of SEQ ID NO:289 or fragments thereof may be used in diagnosis and treatment of diseases
such as, but not limited to, autoimmune disorders such as rheumatoid arthritis, Graves' disease,
systemic lupus erythematosus, autoimmune hepatitis, Wegener's granulomatosis, sarcoidosis,
polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's syndrome, inflammatory
bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis, Type I diabetes,
insulin-dependent diabetes mellitus, lupus nephritis, and allergic encephalomyelitis; proliferative disorders
including various forms of cancer such as leukemias, lymphomas (Hodgkin's and non-Hodgkin's),
sarcomas, melanomas, adenomas, carcinomas of solid tissue, hypoxic tumors, squamous cell
carcinomas of the mouth, throat, larynx, and lung, genitourinary cancers such as cervical and
bladder cancer, hematopoietic cancers, head and neck cancers, and nervous system cancers, benign
lesions such as papillomas, atherosclerosis, angiogenesis; viral infections, in particular HCV and
HIV infections, as well as other pathogen-induced infections (e.g. leishmania).

The protein of SEQ ID NO:289 or fragments thereof may also be used to treat conditions
associated with inflammation or immune impairment (e.g. rheumatoid arthritis, osteoarthritis and
AIDS).

Another embodiment of the present invention relates to the use of the protein of SEQ ID
NO:289 or fragments thereof to treat and/or prevent the ill effects of bacterial infection during
pregnancy in mammals, such as spontaneous abortion and maternal death. In a preferred
embodiment, the protein of the invention may be used to counteract the effects of the bacterial
endotoxin lipopolysaccharide (LPS). Methods for using such compositions are described in
Dziegielewska and Andersen, Biol. Neonate, 74:372-5 (1998), the disclosure of which is
incorporated herein by reference in its entirety.

Furthermore, the protein of SEQ ID NO:289 or fragments thereof are useful as a reagent for
analyzing the control of gene expression by interferons and other cytokines in both normal and
diseased cells. The protein of SEQ ID NO:289 or fragments thereof may be used to identify
specific molecules to which it binds, such as agonists, antagonists or inhibitors.

Another embodiment of the present invention relates to methods of using the protein of 
SEQ ID NO:289 or fragments thereof to identify and/or quantify cytokines of the interferon family 
as well as other cytokines such as IL-10 and tumor antigens, which may interact with the protein of 
the invention. 

The protein of SEQ ID NO:289 or fragments thereof may also be included in
pharmaceutical preparations for treating cancer or prevention/treatment of other diseases associated
with changes in expression of the protein of the invention. In another embodiment of the present
invention, the protein of SEQ ID NO:289 or fragments thereof is used to inhibit and/or modulate the
effect of cytokines and related molecules such as IL-2, TNF alpha, CTLA4, CD28, and others, by
preventing the binding of the endogenous cytokines to their natural receptors, thereby blocking cell
proliferation or inhibitory signals generated by the ligand-receptor binding event.

The protein of SEQ ID NO:289 or fragments thereof is useful to correct defects in in vivo
models of disease such as autoimmune, inflammation and tumor models, by injecting the protein
intraperitoneally, intravenously, subcutaneously or directly into the diseased tissue.

The DNA encoding the protein of SEQ ID NO:289 or fragments thereof is useful in
diagnostic assays for conditions/diseases associated with expression of the protein of the invention.
The diagnostic assay is useful to distinguish between absence, presence, and excess expression of
the protein of the invention and to monitor regulation of levels of the protein of the invention during
therapeutic intervention. The DNA may also be incorporated into effective eukaryotic expression
vectors and directly targeted to a specific tissue, organ, or cell population for use in gene therapy to
treat the above mentioned conditions, including tumors, and/or to correct disease- or genetic-induced
defects in any of the above mentioned proteins including the protein of the invention. The DNA
may also be used to design antisense sequences and ribozymes, which can be administered to
modify gene expression in tumor and pathogen-infected cells and to influence expression of
cytokines and growth factors. In vivo delivery of genetic constructs into subjects can be developed
to the point of targeting specific cell types, such as tumor cells, where expression of the protein of the
invention may be affected or is modulating the expression and/or activity of other proteins such as
cytokines, growth factors, their receptors and/or tumor antigens. The DNA is also useful to detect unknown
upstream sequences (e.g. promoters and regulatory elements) by standard techniques and for
research into the control of gene expression by interferons and other cytokines, as well as growth
and transcription factors in normal and diseased cells. Hybridization probes are useful to detect
DNA encoding the protein of the invention (or closely related molecules) in biological samples, and
for mapping the naturally occurring genomic sequence to a particular chromosome/chromosome
region. The DNA may be used to generate and/or treat in vivo animal models of disease, including
susceptibility or resistance to infection, inflammation, tumors and autoimmune conditions, as well
as tumor therapy, based on vaccine, knock-out and transgene technologies.

Antibodies against the protein of SEQ ID NO:289 or fragments thereof are useful for the
diagnosis of conditions and diseases associated with its expression and to quantify the protein of the
invention (e.g. in assays to monitor patients during therapeutic intervention). Antibodies specific
for the protein may include, but are not limited to, polyclonal, monoclonal, chimeric, single chain,
and Fab fragments produced by a Fab expression library. Neutralizing antibodies are especially
preferred for diagnostics and therapeutics. Diagnostic assays for the protein of the invention include
methods utilizing the antibody and a label to detect the protein of the invention in human body
fluids or extracts of cells or tissues.

The protein of the invention and its catalytic or immunogenic fragments or oligopeptides
thereof can be used for screening therapeutic compounds in any of a variety of drug screening
techniques, including high throughput. Methods which may be used to quantitate the expression of
the nucleotide or protein of the invention include, but are not limited to, polymerase chain reaction
(PCR), RT-PCR, RNase protection, Northern and Western blotting, enzyme-linked immunosorbent
assay (ELISA), radioimmunoassay (RIA), fluorescence-activated cell sorting (FACS),
immunoprecipitation, and chromatography.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:289,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. For example, the condition may be cancer, including breast
cancer, viral infection, bacterial infection, inflammation, autoimmune disorders, rheumatoid
arthritis, Graves' disease, systemic lupus erythematosus, autoimmune hepatitis, Wegener's
granulomatosis, sarcoidosis, polyarthritis, pemphigus, pemphigoid, erythema multiforme, Sjogren's
syndrome, inflammatory bowel disease, multiple sclerosis, myasthenia gravis, keratitis, scleritis,
Type I diabetes, insulin-dependent diabetes mellitus, lupus nephritis, and allergic
encephalomyelitis; proliferative disorders including various forms of cancer such as leukemias,
lymphomas (Hodgkin's and non-Hodgkin's), sarcomas, melanomas, adenomas, carcinomas of solid
tissue, hypoxic tumors, squamous cell carcinomas of the mouth, throat, larynx, and lung,
genitourinary cancers such as cervical and bladder cancer, hematopoietic cancers, head and neck
cancers, and nervous system cancers, benign lesions such as papillomas, atherosclerosis,
angiogenesis; viral infections, in particular HCV and HIV infections, as well as other
pathogen-induced infections (e.g. leishmania).
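
The fragments of consecutive amino acids referred to above may be enumerated as in the
following minimal sketch. The function names and the short example peptide are hypothetical.
The sketch also shows that, for a protein of 130 amino acids, the listed lengths of 150 and
200 yield no fragments.

```python
# Fragment lengths listed in the text.
FRAGMENT_LENGTHS = [5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, 200]


def consecutive_fragments(sequence, length):
    """Yield every run of `length` consecutive residues of sequence."""
    for start in range(len(sequence) - length + 1):
        yield sequence[start:start + length]


def fragment_counts(sequence_length, lengths=FRAGMENT_LENGTHS):
    """Number of distinct start positions available for each fragment length."""
    return {n: max(sequence_length - n + 1, 0) for n in lengths}


# Hypothetical five-residue peptide, for illustration only.
print(list(consecutive_fragments("MKVLA", 3)))

# Counts for a protein of 130 amino acids, as stated for SEQ ID NO:289.
counts_130 = fragment_counts(130)
print(counts_130[5], counts_130[100], counts_130[150])
```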

In such embodiments, the protein of SEQ ID NO:289, or a fragment thereof, is
administered to an individual in whom it is desired to increase or decrease any of the activities of
the protein of SEQ ID NO:289. The protein of SEQ ID NO:289 or fragment thereof may be
administered directly to the individual or, alternatively, a nucleic acid encoding the protein of SEQ
ID NO:289 or a fragment thereof may be administered to the individual. Alternatively, an agent
which increases the activity of the protein of SEQ ID NO:289 may be administered to the
individual. Such agents may be identified by contacting the protein of SEQ ID NO:289 or a cell or
preparation containing the protein of SEQ ID NO:289 with a test agent and assaying whether the
test agent increases the activity of the protein. For example, the test agent may be a chemical
compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:289 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:289 may be identified by contacting the protein of
SEQ ID NO:289 or a cell or preparation containing the protein of SEQ ID NO:289 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, brain, fetal brain, kidney, fetal kidney, or colon, or to distinguish between two or more
possible sources of a sample on the basis of the level of the protein of SEQ ID NO:289 in the
sample. For example, the protein of SEQ ID NO:289 or fragments thereof may be used to generate
antibodies using any techniques known to those skilled in the art, including those described herein.
Such antibodies may then be used to identify tissues of unknown origin, for example, forensic
samples, differentiated tumor tissue that has metastasized to foreign bodily sites, or to differentiate
different tissue types in a tissue cross-section using immunochemistry. In such methods a sample is
contacted with the antibody, which may be detectably labeled, under conditions which facilitate
antibody binding. The level of antibody binding to the test sample is measured and compared to the
level of binding to control cells from brain, fetal brain, kidney, fetal kidney, or colon or tissues other
than brain, fetal brain, kidney, fetal kidney, or colon to determine whether the test sample is from
brain, fetal brain, kidney, fetal kidney, or colon. Alternatively, the level of the protein of SEQ ID
NO:289 in a test sample may be measured by determining the level of RNA encoding the protein of
SEQ ID NO:289 in the test sample. RNA levels may be measured using nucleic acid arrays or
using techniques such as in situ hybridization, Northern blots, dot blots or other techniques familiar
to those skilled in the art. If desired, an amplification reaction, such as a PCR reaction, may be
performed on the nucleic acid sample prior to analysis. The level of RNA in the test sample is
compared to RNA levels in control cells from brain, fetal brain, kidney, fetal kidney, or colon or
tissues other than brain, fetal brain, kidney, fetal kidney, or colon to determine whether the test
sample is from brain, fetal brain, kidney, fetal kidney, or colon.
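
The comparison of a test sample with control tissues described above may be sketched as
follows. This is an illustration under stated assumptions: the function name, the choice of
the nearest control on a logarithmic scale as the decision rule, the "liver" control and all
numeric levels are hypothetical and are not taken from the present application.

```python
import math


def closest_control(test_level, control_levels):
    """Return the control tissue whose level is nearest to test_level on a log scale.

    control_levels maps a tissue name to a positive expression level. The
    nearest-on-log-scale rule is an assumed decision rule for illustration.
    """
    if test_level <= 0 or any(v <= 0 for v in control_levels.values()):
        raise ValueError("levels must be positive")
    return min(
        control_levels,
        key=lambda tissue: abs(math.log(test_level) - math.log(control_levels[tissue])),
    )


# Hypothetical control levels in arbitrary units.
controls = {"brain": 100.0, "kidney": 20.0, "liver": 1.0}
print(closest_control(80.0, controls))
print(closest_control(15.0, controls))
```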

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:289,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:289 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:289 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:289 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:289. In such techniques, the level of the protein of SEQ ID NO:289 in
an ill individual is measured using techniques such as those described herein. The level of the
protein of SEQ ID NO:289 in the ill individual is compared to the level in normal individuals to
determine whether the individual has a level of the protein of SEQ ID NO:289 which is associated
with disease.

The protein of SEQ ID NO:268 is encoded by the cDNA of SEQ ID NO:27. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:268
described throughout the present application also pertain to the polypeptide encoded by the human
cDNA of clone 116-111-4-0-B3-CS. In addition, it will be appreciated that all characteristics and
uses of the nucleic acid of SEQ ID NO:27 described throughout the present application also pertain
to the human cDNA of clone 116-111-4-0-B3-CS. The protein of the invention is found to be
expressed in testis and lungs.

The protein of SEQ ID NO:268 encoded by the extended cDNA of SEQ ID NO:27 is a
splicing variant of XAGE-1, a member of the CT antigen family overexpressed in Ewing sarcoma
(Liu, X. F., L. J. Helman, et al. (2000). Cancer Res 60(17): 4752-5, the disclosure of which is
incorporated herein by reference in its entirety). In addition, the protein of SEQ ID NO:268 also
shows strong homology at the COOH end with PAGE4, another member of the CT antigen family
(Brinkmann, U., G. Vasmatzis, et al. (1999) Cancer Res 59(7): 1445-8, the disclosure of which is
incorporated herein by reference in its entirety).

The cDNA of SEQ ID NO:27 is composed of 5 exons. Exon 1 lies between nucleotides 1-245,
exon 2 lies between nucleotides 246-370, exon 3 lies between nucleotides 371-512, exon 4 lies
between nucleotides 513-639, and exon 5 lies between nucleotides 640-762. Exons 2 to 5 of the cDNA of
SEQ ID NO:27 are shared in part with XAGE-1. However, since the initiation codon of SEQ ID
NO:27 is located in intron 1 of XAGE-1, there is a frameshift in the alignment of the 2 molecules.
Exon 2 of SEQ ID NO:27 lies between nucleotides 110-234 of XAGE-1, exon 3 of SEQ ID NO:27
lies between nucleotides 235-376 of XAGE-1, exon 4 of SEQ ID NO:27 lies between nucleotides
377-503 of XAGE-1, and part of exon 5 of SEQ ID NO:27 lies between nucleotides 504-526 of XAGE-1.
XAGE-1 is overexpressed in sarcoma and alveolar rhabdomyosarcoma and is also highly
expressed in normal testis (Liu, X. F., L. J. Helman, et al. (2000). Cancer Res 60(17): 4752-5, the
disclosure of which is incorporated herein by reference in its entirety). In addition, XAGE-1 shares
homology with PAGE-4 (Brinkmann, U., G. Vasmatzis, et al. (1999) Cancer Res 59(7): 1445-8, the
disclosure of which is incorporated herein by reference in its entirety) at the COOH end.
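
The nucleotide ranges given above for the exons of SEQ ID NO:27 and for the corresponding
regions of XAGE-1 may be compared by length, as in the following minimal sketch. The
coordinates are those quoted in the text; the variable and function names are assumptions
made for the example. By length, the first three XAGE-1 ranges (125, 142 and 127
nucleotides) equal exons 2, 3 and 4 of SEQ ID NO:27, and the fourth (23 nucleotides) is
shorter than exon 5 (123 nucleotides), which is in line with the statement that exons 2 to 5
are shared in part with XAGE-1.

```python
# Exon boundaries of the cDNA of SEQ ID NO:27 (1-based, inclusive), as quoted in the text.
SEQ27_EXONS = {1: (1, 245), 2: (246, 370), 3: (371, 512), 4: (513, 639), 5: (640, 762)}

# XAGE-1 nucleotide ranges quoted in the text (1-based, inclusive).
XAGE1_RANGES = [(110, 234), (235, 376), (377, 503), (504, 526)]


def span_length(span):
    """Number of nucleotides in an inclusive (start, end) range."""
    start, end = span
    return end - start + 1


def exons_matching_length(span, exons=SEQ27_EXONS):
    """Return the exon numbers whose length equals the length of span."""
    return [n for n, exon in exons.items() if span_length(exon) == span_length(span)]


exon_lengths = [span_length(e) for e in SEQ27_EXONS.values()]
xage_lengths = [span_length(s) for s in XAGE1_RANGES]
print(exon_lengths, xage_lengths)
print([exons_matching_length(s) for s in XAGE1_RANGES])
```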

CT antigens are a distinct class of differentiation antigens that are expressed by cancers
arising in nonessential normal tissues such as prostate, breast, and ovary (G. Vasmatzis et al., Proc.
Natl. Acad. Sci. USA, 95: 300-304, 1998, the disclosure of which is incorporated herein by
reference in its entirety) and that have a restricted pattern of expression in normal tissues. This
class of antigens is presented on the surface of tumor cells and is recognized by cytolytic T cells,
leading to lysis. To the extent that these antigens have been studied, it has been via cytolytic T cell
characterization studies in vitro, i.e., the study of the identification of the antigen by a particular
cytolytic T cell ("CTL" hereafter) subset. The subset proliferates upon recognition of the presented
tumor rejection antigen, and the cells presenting the antigen are lysed. Characterization studies have
identified CTL clones which specifically lyse cells expressing the antigens. Examples of this work
may be found in Levy et al., Adv. Cancer Res. 24: 1-59 (1977); Boon et al., J. Exp. Med. 152:
1184-1193 (1980); Brunner et al., J. Immunol. 124: 1627-1634 (1980); Maryanski et al., Eur. J.
Immunol. 124: 1627-1634 (1980); Maryanski et al., Eur. J. Immunol. 12: 406-412 (1982); Palladino
et al., Canc. Res. 47: 5074-5079 (1987), the disclosures of which are incorporated herein by
reference in their entireties.

Some thoroughly studied CT antigens are MAGE, BAGE, GAGE and LAGE; others have
been added, including PAGE and XAGE, most of them located on chromosome X. Brinkmann et al.
reported the identification of three new members of the GAGE/PAGE family, termed XAGEs.
XAGE-1 and XAGE-2 are expressed in Ewing's sarcoma, rhabdomyosarcoma, a breast cancer, and
a germ cell tumor.

It is believed that the protein of SEQ ID NO:268 is a splicing variant of XAGE-1, a CT
antigen overexpressed in Ewing sarcoma.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:268,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition, such as those listed above, associated with over- or under-expression of the
protein of SEQ ID NO:268. In such embodiments, the protein of SEQ ID NO:268, or a fragment
thereof, is administered to an individual in whom it is desired to increase or decrease any of the
activities of the protein of SEQ ID NO:268. The protein of SEQ ID NO:268 or fragment thereof may
be administered directly to the individual or, alternatively, a nucleic acid encoding the protein of
SEQ ID NO:268 or a fragment thereof may be administered to the individual. Alternatively, an
agent which increases the activity of the protein of SEQ ID NO:268 may be administered to the
individual. Such agents may be identified by contacting the protein of SEQ ID NO:268 or a cell or
preparation containing the protein of SEQ ID NO:268 with a test agent and assaying whether the
test agent increases the activity of the protein. For example, the test agent may be a chemical
compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:268 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:268 may be identified by contacting the protein of
SEQ ID NO:268 or a cell or preparation containing the protein of SEQ ID NO:268 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify tissues, preferably testis and
lungs, or to distinguish between two or more possible sources of a tissue sample on the basis of the
level of the protein of SEQ ID NO:268 in the sample. For example, the protein of SEQ ID NO:268
or fragments thereof may be used to generate antibodies using any techniques known to those
skilled in the art, including those described herein. Such tissue-specific antibodies may then be
used to identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue
that has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a tissue sample is contacted with the
antibody, which may be detectably labeled, under conditions which facilitate antibody binding. The
level of antibody binding to the test sample is measured and compared to the level of binding to
control cells from testis or lungs or tissues other than testis or lungs to determine whether the test
sample is from testis or lungs. Alternatively, the level of the protein of SEQ ID NO:268 in a test
sample may be measured by determining the level of RNA encoding the protein of SEQ ID NO:268
in the test sample. RNA levels may be measured using nucleic acid arrays or using techniques such
as in situ hybridization, Northern blots, dot blots or other techniques familiar to those skilled in the
art. If desired, an amplification reaction, such as a PCR reaction, may be performed on the nucleic
acid sample prior to analysis. The level of RNA in the test sample is compared to RNA levels in
control cells from testis or lungs or tissues other than testis or lungs to determine whether the test
sample is from testis or lungs.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:268,
including Ewing sarcoma cells, rhabdomyosarcoma cells, breast cancer cells and germ cell tumor
cells, using methods known to those skilled in the art. For example, an antibody against the protein
of SEQ ID NO:268 or a fragment thereof may be fixed to a solid support, such as a chromatography
matrix. A preparation containing cells expressing the protein of SEQ ID NO:268 is placed in
contact with the antibody under conditions which facilitate binding to the antibody. The support is
washed and then the cells are released from the support by contacting the support with agents which
cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:268 or a
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:268. In some embodiments, the protein of SEQ ID NO:268 or fragments
thereof may be used to diagnose Ewing sarcoma, rhabdomyosarcoma, breast cancer or germ cell
tumors. In such techniques, the level of the protein of SEQ ID NO:268 in an ill individual is
measured using techniques such as those described herein. The level of the protein of SEQ ID
NO:268 in the ill individual is compared to the level in normal individuals. An elevated level or
decreased level of the protein of SEQ ID NO:268 relative to normal individuals suggests that the ill
individual is suffering from a defect in intercellular communication or secretion.

Another embodiment of the invention relates to compositions and methods using the protein 
of SEQ ID NO:268 or a fragment thereof as possible targets for vaccine-based therapies of cancer, 
including Ewing sarcoma, rhabdomyosarcoma, breast cancer or germ cell tumors. In such 
embodiments, an antibody against the protein of SEQ ID NO:268 or a fragment thereof is 
administered to an individual suffering from cancer in an amount sufficient to ameliorate or 
eliminate the cancer. 

Protein of SEQ ID NO:399 (internal designation 160-40-1-0-H4-CS) 

The protein of SEQ ID NO:399 is encoded by the cDNA of SEQ ID NO:158. Accordingly, 
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:399 
described throughout the present application also pertain to the polypeptide encoded by the human 
cDNA of clone 160-40-1-0-H4-CS. In addition, it will be appreciated that all characteristics and 
uses of the nucleic acid of SEQ ID NO:158 described throughout the present application also 
pertain to the human cDNA of clone 160-40-1-0-H4-CS. The protein of the invention is found to be 
expressed in testis and lungs. It is overrepresented in fetal brain. 

The protein of SEQ ID NO:399 encoded by the cDNA of SEQ ID NO:158 is homologous to 
proteins of the Phosphatidic Acid Phosphatase type 2 (PAP2) superfamily (Stukey J. and Carman 
G.M., Protein Sci 1997;6:469-472, the disclosure of which is incorporated herein by reference in its 
entirety). Three variants of human PAP, i.e. PAP-alpha 2 (W79285) and its alternatively spliced 
form PAP-alpha 1 (W79284), PAP-beta (W79286) and PAP-gamma (W79287) have been 
identified. The protein of SEQ ID NO:399 displays a pfam characteristic domain of the PAP2 
superfamily from positions 19 to 175. Accordingly, one embodiment of the present invention is a 
polypeptide comprising amino acid residues 19 to 175 of SEQ ID NO:399. Four membrane 
spanning domains are predicted from amino acid positions 17 to, 47 to 67, 108 to 128, and 141 to 
161. Accordingly, another embodiment of the present invention is a polypeptide comprising one or 
more of the foregoing membrane spanning domains. 

Phosphatidic acid phosphatase (PAP) (also referred to as phosphatidate phosphohydrolase) 
is known to be an important enzyme for glycerolipid biosynthesis. In particular, PAP catalyzes the 
conversion of phosphatidic acid (PA) into diacylglycerol (DAG). PA and DAG are lipids involved 
in signal transduction and in structural membrane-lipid biosynthesis in cells; thus they represent an 
important regulatory point in eukaryotic phospholipid metabolism. DAG is a well-studied lipid 
second messenger which is essential for the activation of protein kinase C (Kent; Annu. Rev. 
Biochem.; 64:315-343; 1995), whereas PA itself is also a lipid messenger implicated in various 
signaling pathways such as NADPH oxidase activation and calcium mobilization (English; Cell 
Signal.; 8:341-347; 1996, the disclosure of which is incorporated herein by reference in its entirety). 
The regulation of PAP activity can therefore affect the balance of divergent signaling processes that 
the cell receives in terms of PA and DAG (Brindley et al.; Chem. Phys. Lipids 80:45-57; 1996, the 
disclosure of which is incorporated herein by reference in its entirety). 
PAP exists in at least two isoforms, one of which (PAP1) is presumed to be cytosolic and 
membrane associated and the other (PAP2) to be an integral membrane protein (Leung D.W., 
Tompkins C.K., White T.; DNA Cell Biol. 17:377-385 (1998)). The protein of the invention has 
180 amino acids and four predicted membrane-spanning segments, and so is presumed to be an 
integral membrane protein. 

The protein of SEQ ID NO:399 is encoded by a cDNA that has homology to many alternatively 
spliced forms of PAP2 genes. For example, the protein of SEQ ID NO:399 has 29% 
homology with human phosphatidic acid phosphohydrolase type-2C protein. The protein of SEQ ID 
NO:399 also has 40% homology with human phosphatidic acid phosphatase 2B protein. In 
addition, the protein of SEQ ID NO:399 has 33% homology with human type 2 phosphatidic acid 
phosphatase alpha-2 protein. PAP2-alpha2 and PAP2-alpha1 are two isoforms presumed 
to be alternative splice variants from a single gene. 

Northern analysis has shown that PAP2-alpha mRNA expression was suppressed in several 
tumor tissues, indicating that PAP-2 may act as a tumor suppressor. The relationship of PAP and 
tumor suppression is further evidenced in findings that PAP activity is lower in fibroblast cell lines 
transformed with either the ras or fps oncogene than in the parental rat1 cell line (Brindley et al.; 
Chem. Phys. Lipids 80:45-57; 1996, the disclosure of which is incorporated herein by reference in 
its entirety). As discussed above, a decrease in PAP activity in transformed cells correlates with a 
concomitant increase in PA concentration. Moreover, elevated PAP activity and lower levels of PA 
have been observed in contact-inhibited fibroblasts relative to proliferating and transformed 
fibroblasts (Brindley et al.; Chem. Phys. Lipids 80:45-57; 1996, the disclosure of which is 
incorporated herein by reference in its entirety). Therefore, the protein of SEQ ID NO:399 or 
fragments thereof may be used to decrease cell division and as such can provide a useful tool in 
treating cancer. Subsequent analysis of colon tumor tissue derived from four donors confirmed 
lower expression of PAP2-alpha than in matching normal colon tissue. Considering these data and 
previous demonstrations that certain transformed cell lines have lower PAP activity, human PAP 
cDNAs may be used for gene therapy for certain tumors (Leung D.W., Tompkins C.K., White 
T.; DNA Cell Biol. 17:377-385 (1998), the disclosure of which is incorporated herein by reference 
in its entirety). Accordingly, one embodiment of the present invention is the use of the protein of 
SEQ ID NO:399 or a fragment thereof as a tumor suppressor. For example, a nucleic acid 
expressing the protein of SEQ ID NO:399 or a fragment thereof may be introduced into an 
individual suffering from cancer in order to ameliorate or eliminate the cancer. In fact, nucleic 
acids encoding human phosphatidic acid phosphatases have been used to regulate levels of lipid 
cellular mediators and in gene therapy of e.g. cancer (PCT publication WO98/46730, the disclosure 
of which is incorporated herein by reference in its entirety). 

In another embodiment of the present invention, the protein of SEQ ID NO:399 or a 
fragment thereof can be used to control the balance of lipid mediators of cellular activation and 
signal transduction. The protein of the invention has 33% homology with human phosphatidic acid 
phosphatase 2A protein. PAP2A is an integral membrane glycoprotein at the cell surface that plays 
an active role in the hydrolysis and uptake of lipids from the extracellular space (Roberts RZ, 
Morris AJ; Biochim Biophys Acta 2000 Aug 24;1487(1):33-49, the disclosure of which is 
incorporated herein by reference in its entirety). Accordingly, the level or activity of the protein of 
SEQ ID NO:399 may be modulated to influence the rate or extent of hydrolysis and uptake of lipids 
from the extracellular space using methods such as those described herein. 

In another embodiment of the present invention, the protein of SEQ ID NO:399 can be used 
to counterbalance the inflammatory response. PA has been implicated in cytokine-induced 
inflammatory responses (Bursten et al.; Circ. Shock 44:14-29, 1994; Abraham et al.; J. Exp. Med. 
181:569-575, 1995; Rice et al.; PNAS 91:3857-3861, 1994; Leung et al.; PNAS 92:4813-4817, 
1995, the disclosures of which are incorporated herein by reference in their entireties) and the 
modulation of numerous protein kinases involved in signal transduction (English et al.; Chem. Phys. 
Lipids 80:117-132, 1996, the disclosure of which is incorporated herein by reference in its 
entirety). In addition, a nucleic acid encoding the protein of SEQ ID NO:399 or a fragment thereof 
may be used to counterbalance the inflammatory response from cytokine stimulation through 
degradation of excess PA in cells, or to treat or ameliorate inflammatory diseases. 

The gene encoding the protein of SEQ ID NO:399 or a fragment thereof can also be used in 
gene therapy for the treatment of obesity associated with diabetes. PAP activity is decreased in the 
livers and hearts of the grossly obese and insulin resistant JCR:LA corpulent rat compared to the 
control lean phenotype (Brindley et al.; Chem. Phys. Lipids 80:45-57; 1996, the disclosure of 
which is incorporated herein by reference in its entirety). The protein of the invention therefore can 
provide an important tool for the treatment of obesity associated with diabetes. 

Accordingly, the present invention includes the use of the protein of SEQ ID NO:399, 
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200 
consecutive amino acids thereof, or fragments having a desired biological activity to treat or 
ameliorate a condition, such as those listed above, in an individual. In such embodiments, the 
protein of SEQ ID NO:399, or a fragment thereof, is administered to an individual in whom it is 
desired to increase or decrease any of the activities of the protein of SEQ ID NO:399, including 
glycerolipid biosynthesis, conversion of phosphatidic acid into diacylglycerol, signal transduction, 
membrane-lipid biosynthesis, activation of protein kinase C, NADPH oxidase activation, calcium 
mobilization, cell division, production of diacylglycerol, monoacylglycerol, ceramide or 
sphingosine, modulation of the inflammatory response or dephosphorylation of a substrate such as 
lysophosphatidic acid, ceramide 1-phosphate, or sphingosine 1-phosphate, or treatment or 
amelioration of obesity associated with diabetes. The protein of SEQ ID NO:399 or fragment 
thereof may be administered directly to the individual or, alternatively, a nucleic acid encoding the 
protein of SEQ ID NO:399 or a fragment thereof may be administered to the individual. 
Alternatively, an agent which increases the activity of the protein of SEQ ID NO:399 may be 
administered to the individual. Such agents may be identified by contacting the protein of SEQ ID 
NO:399 or a cell or preparation containing the protein of SEQ ID NO:399 with a test agent and 
assaying whether the test agent increases the activity of the protein. For example, the test agent 
may be a chemical compound or a polypeptide or peptide. 

Alternatively, the activity of the protein of SEQ ID NO:399 may be decreased by 
administering an agent which interferes with such activity to an individual. Agents which interfere 
with the activity of the protein of SEQ ID NO:399 may be identified by contacting the protein of 
SEQ ID NO:399 or a cell or preparation containing the protein of SEQ ID NO:399 with a test agent 
and assaying whether the test agent decreases the activity of the protein. For example, the agent 
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an 
antisense nucleic acid or a triple helix-forming nucleic acid. 
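
By way of non-limiting illustration, the screening logic of the two preceding paragraphs (contacting the protein with a test agent and assaying whether activity rises or falls) can be sketched as follows. The function name, the fold-change threshold and the activity readings are hypothetical and are not taken from the present application; any suitable activity assay and cutoff may be substituted.

```python
def classify_agent(baseline_activity, activity_with_agent, min_fold_change=1.5):
    """Classify a test agent from two activity measurements.

    baseline_activity: activity of the protein without the agent.
    activity_with_agent: activity measured after contact with the agent.
    min_fold_change: hypothetical cutoff; an assumption of this sketch.
    """
    if baseline_activity <= 0 or activity_with_agent < 0:
        raise ValueError("activities must be positive (baseline) and non-negative")
    ratio = activity_with_agent / baseline_activity
    if ratio >= min_fold_change:
        return "increases activity"
    if ratio <= 1 / min_fold_change:
        return "decreases activity"
    return "no clear effect"


# Made-up readings in arbitrary units, for illustration only.
for name, with_agent in [("agent A", 20.0), ("agent B", 5.0), ("agent C", 11.0)]:
    print(name, classify_agent(10.0, with_agent))
```

In practice the cutoff would be chosen from replicate measurements and assay variability rather than fixed in advance.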

In one embodiment, the invention relates to methods and compositions using the protein of 
the invention or part thereof as a marker protein to selectively identify tissues, preferably brain, or 
to distinguish between two or more possible sources of a tissue sample on the basis of the level of 
the protein of SEQ ID NO:399 in the sample. For example, the protein of SEQ ID NO:399 or 
fragments thereof may be used to generate antibodies using any techniques known to those skilled 
in the art, including those described herein. Such tissue-specific antibodies may then be used to 
identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue that 
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue 
cross-section using immunochemistry. In such methods a tissue sample is contacted with the antibody, 
which may be detectably labeled, under conditions which facilitate antibody binding. The level of 
antibody binding to the test sample is measured and compared to the level of binding to control cells 
from brain or tissues other than brain to determine whether the test sample is from brain. 
Alternatively, the level of the protein of SEQ ID NO:399 in a test sample may be measured by 
determining the level of RNA encoding the protein of SEQ ID NO:399 in the test sample. RNA 
levels may be measured using nucleic acid arrays or using techniques such as in situ hybridization, 
Northern blots, dot blots or other techniques familiar to those skilled in the art. If desired, an 
amplification reaction, such as a PCR reaction, may be performed on the nucleic acid sample prior 
to analysis. The level of RNA in the test sample is compared to RNA levels in control cells from 
brain or tissues other than brain to determine whether the test sample is from brain. 
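
By way of non-limiting illustration, the comparison of a test sample against control levels described above can be sketched as a nearest-control assignment. The function name, the tissue labels and all numerical values are hypothetical; the sketch assumes that protein or RNA levels have already been measured and normalized to a common scale.

```python
from statistics import mean


def closest_control(test_level, controls):
    """Return the control label whose mean level is nearest to the test level.

    controls maps a tissue label to a list of measured levels
    (hypothetical, normalized signal units).
    """
    return min(controls, key=lambda tissue: abs(mean(controls[tissue]) - test_level))


# Made-up control measurements, for illustration only.
controls = {
    "brain": [8.9, 9.4, 9.1],
    "other": [1.2, 0.8, 1.5],
}
print(closest_control(8.7, controls))
```

A real assay would also account for measurement variance, for example by requiring the test level to fall within a defined range of the control mean before a tissue of origin is assigned.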

In another embodiment, antibodies to the protein of the invention or part thereof may be 
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:399 
using methods known to those skilled in the art. For example, an antibody against the 
protein of SEQ ID NO:399 or a fragment thereof may be fixed to a solid support, such as a 
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:399 is 
placed in contact with the antibody under conditions which facilitate binding to the antibody. The 
support is washed and then the cells are released from the support by contacting the support with 
agents which cause the cells to dissociate from the antibody. 

In another embodiment of the present invention, the protein of SEQ ID NO:399 or a 
fragment thereof may be used to diagnose disorders associated with altered expression of 
the protein of SEQ ID NO:399. In some embodiments, the protein of SEQ ID NO:399 or fragments 
thereof may be used to diagnose cancer. In such techniques, the level of the protein of SEQ ID 
NO:399 in an ill individual is measured using techniques such as those described herein. The level 
of the protein of SEQ ID NO:399 in the ill individual is compared to the level in normal individuals. 
An elevated level or decreased level of the protein of SEQ ID NO:399 relative to normal individuals 
suggests that the ill individual may suffer from cancer or be predisposed to getting cancer in the 
future. 

In another embodiment, the present invention relates to methods of preparing a PAP protein 
of SEQ ID NO:399 comprising the steps of (i) transforming a host cell with an expression vector 
comprising a polynucleotide encoding SEQ ID NO:399, (ii) culturing the transformed host cells 
which express the protein and (iii) isolating the protein. The present invention also relates to a 
method of dephosphorylating a substrate comprising contacting the substrate with an effective 
amount of isolated protein of SEQ ID NO:399 or a fragment thereof such that the protein catalyzes 
the dephosphorylation of the substrate. It is further provided that this method occurs in vitro, and 
comprises a step of isolating the dephosphorylated substrate. Additionally, the method can occur in 
vivo, and is effected by the administration of the protein of the invention (or part of it) to a mammal 
in need thereof. 

Protein of SEQ ID NOs:258 and 262 (internal designations 110-007-1-0-C7-CS, 116-055-1-0-A3-CS) 

The protein of SEQ ID NO:258 is encoded by the cDNA of SEQ ID NO:17. Accordingly, 
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:258 
described throughout the present application also pertain to the polypeptide encoded by the human 
cDNA of clone 110-007-1-0-C7-CS. In addition, it will be appreciated that all characteristics and 
uses of the nucleic acid of SEQ ID NO:17 described throughout the present application also pertain 
to the human cDNA of clone 110-007-1-0-C7-CS. The protein of SEQ ID NO:258 shows 
homologies to two high affinity IgE receptor-like proteins (IGER) with GENESEQP accession 
numbers W96745 and W41056, the disclosures of which are incorporated herein by reference in 
their entireties. The protein of SEQ ID NO:258 is expressed in liver and testis. The protein of SEQ 
ID NO:262, encoded by SEQ ID NO:21, is a variant of the protein of SEQ ID NO:258 and shares 
all the potential uses and functions described herein. This protein and cDNA share all of the 
characteristics and uses of the clone 116-055-1-0-A3-CS and the product thereof. 
Like the two high affinity IgE receptor-like proteins, the protein of the invention contains 
four transmembrane spanning domains of 20 amino acids, between amino acids 53-73, 79-99, 
121-141 and 158-178, respectively. The protein of SEQ ID NO:258 crosses the plasma membrane four 
times, forming two small extracellular loops, and has both the N- and C-termini in the cytoplasm. 
Moreover, the protein of the invention contains a signal peptide (cleavage site at position 21). 

The predicted structure of the protein of SEQ ID NO:258 demonstrates the relationship of 
this protein to FcεRIβ and CD20 antigen and provides evidence for a family of 4-transmembrane 
spanning proteins. The conservation of amino acids between all three proteins is highest in the four 
transmembrane domains. While greater divergence exists in the hydrophilic amino and carboxyl 
termini, several amino acids within these regions are conserved, such as the presence of 4 prolines in 
the amino terminus of all three proteins. In addition, two cysteine residues (positions 147 and 156) 
are present in the second extracellular domain between TM3 and TM4. This suggests that inter- or 
intra-molecular disulfide bonds in this domain are present in all three proteins. 
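
By way of non-limiting illustration, the topology stated above can be checked with a short calculation from the listed transmembrane ranges. The variable and function names are hypothetical; the only inputs are the residue ranges, the cytoplasmic location of both termini and the two cysteine positions given in the text. Counted inclusively, each listed range covers 21 positions.

```python
# Transmembrane (TM) ranges as stated in the text, 1-based and inclusive.
TM_SEGMENTS = [(53, 73), (79, 99), (121, 141), (158, 178)]
CYSTEINES = [147, 156]


def span(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


def loops_between(segments):
    """Return the inclusive residue ranges lying between consecutive TM segments."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(segments, segments[1:])]


loops = loops_between(TM_SEGMENTS)
# With both termini cytoplasmic, the loops alternate starting outside the cell.
sides = ["extracellular", "cytoplasmic", "extracellular"]
for (start, end), side in zip(loops, sides):
    print(f"loop {start}-{end} ({span(start, end)} residues): {side}")

# Both cysteines should fall in the second extracellular loop (TM3 to TM4).
second_extracellular = loops[2]
print(all(second_extracellular[0] <= c <= second_extracellular[1] for c in CYSTEINES))
```

Under these assumptions the two extracellular loops span residues 74-78 and 142-157, and both cysteines lie within the latter, in agreement with the description above.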

FcεRI is part of a tetrameric receptor complex consisting of an α chain, a β chain and two γ 
chains (Kinet et al. Proc. Natl. Acad. Sci. USA, 85:6483-6487 (1988), the disclosure of which is 
incorporated herein by reference in its entirety). Together, they mediate interaction with IgE-bound 
antigens leading to dramatic cellular responses, such as the massive degranulation of mast cells. 
The β subunit is a 4-transmembrane protein with both the amino and carboxyl termini residing in 
the cytoplasm. 

Chromosome mapping localized cDNA of SEQ ID NO:17 to chromosome 11q12, the 
location of the CD20 gene. However, the murine FcεRIβ and Ly-44 (the murine equivalent of 
CD20) are both located in the same position in mouse in chromosome 19 (Tedder, T.F. et al., J. 
Immunol. 141:4388-4394 (1988), Clark E.A. and Lane, J.L. Annu. Rev. Immunol. 9:97-127 (1991), 
the disclosures of which are incorporated herein by reference in their entireties). Therefore, the 
three genes are believed to have originated and evolved from the same locus, further 
supporting the proposition that they are members of the same family of related proteins. 

On the basis of the foregoing information, it is believed that the protein of SEQ ID NO:258 
is a high affinity immunoglobulin E receptor-like protein. 

Atopic diseases, which include allergy, asthma, atopic dermatitis (or eczema) and allergic 
rhinitis, are generally defined as disorders of Immunoglobulin E (IgE) responses to common 
antigens, such as pollen or house dust mites. Atopy is frequently detected by either elevated total serum 
IgE levels, antigen specific IgE response or positive skin tests to common allergens. In principle, 
atopy can result from dysregulation of any part of the pathway which runs from antigen exposure 
and IgE response, through the interaction of IgE with its receptor on mast cells, the high affinity Fc 
receptor FcεRI, to the subsequent cellular activation mediated by that ligand-receptor 
engagement (Ravetch, Nature Genetics, 7:117-118 (1994), the disclosure of which is incorporated 
herein by reference in its entirety). 
Accordingly, the protein of SEQ ID NO:258 or fragments comprising at least 5, 8, 10, 12, 
15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200 consecutive amino acids thereof, or fragments 
having a desired biological activity may be administered to an individual in whom it is desired to 
increase or decrease the activity of the protein of SEQ ID NO:258. In particular, the protein of SEQ 
ID NO:258 or fragment thereof may be administered to an individual in whom it is desired to 
regulate the extent of the IgE response. In such methods, the protein of SEQ ID NO:258 or 
fragment thereof may be administered directly to the individual or, alternatively, a nucleic acid 
encoding the protein of SEQ ID NO:258 or a fragment thereof may be administered to the 
individual. Alternatively, an agent which increases the activity of the protein of SEQ ID NO:258 
may be administered to the individual. Such agents may be identified by contacting the protein of 
SEQ ID NO:258 or a cell or preparation containing the protein of SEQ ID NO:258 with a test agent 
and assaying whether the test agent increases the activity of the protein. For example, the test agent 
may be a chemical compound or a polypeptide or peptide. 

The protein of SEQ ID NO:258 or fragments thereof may also be used to identify genes or 
polypeptides that may play a role in IgE responses or atopic disease. In particular, binding partners 
for the protein of SEQ ID NO:258 or the genes encoding such binding partners may be identified 
using a variety of techniques familiar to those skilled in the art, including the techniques described 
herein. 

The protein of SEQ ID NO:258 or the polynucleotide encoding the protein of SEQ ID 
NO:258 may also be used to diagnose hereditary atopy. In particular, the level of the protein of 
SEQ ID NO:258 may be determined in a test individual using methods such as those described 
herein and compared to the levels of normal individuals and individuals suffering from hereditary 
atopy to determine whether the test individual is suffering from or at risk of suffering from hereditary 
atopy. Alternatively, a nucleic acid sample may be obtained from a test individual and analyzed to 
determine whether it contains a level of RNA encoding the protein of SEQ ID NO:258 which is 
associated with hereditary atopy or a mutation in the gene encoding the protein of SEQ ID NO:258 
which is associated with hereditary atopy. For example, a nucleic acid sample from the test 
individual may be contacted with a nucleic acid probe comprising the nucleic acid encoding the 
protein of SEQ ID NO:258 or a fragment thereof to determine the RNA level or whether the 
individual has a mutation associated with hereditary atopy. The probe may be either DNA, 
including cDNA or genomic DNA, or the probe may be RNA. Any of the methods familiar to those 
skilled in the art may be used in these diagnostic methods, including the methods described herein. 
For example, the presence of a mutation associated with hereditary atopy can be determined using 
methods generally known in the art, such as but not limited to PCR, sequencing or minisequencing 
as described in the method of Yamamoto et al. (Biochem. Biophys. Res. Comm., 182:507 (1992), 
the disclosure of which is incorporated by reference herein in its entirety). 
The protein of SEQ ID NO:258 can also be used to characterize the induction of expression 
of FcεRI and the particular function of FcεRIβ. As such, the protein of the invention can be useful 
in, for example, the design of drugs that block or inhibit induction or activity of FcεRI, thereby 
treating atopic diseases. In particular, test agents which block or inhibit induction or activity may 
be identified using the methods described herein. 

In another embodiment, the protein of SEQ ID NO:258 can be employed in the preparation 
of antibodies, such as monoclonal antibodies, according to methods known in the art, including 
those described herein. The antibodies can be used to block or mimic ligand binding to the receptor 
comprising the protein of the invention or other receptors, such as but not limited to FcεRI. The 
antibodies can also be used to isolate the protein of SEQ ID NO:258 or cells which express the 
protein of SEQ ID NO:258 using methods such as those described herein. For example, the 
antibodies may be used to measure the presence of cells containing the protein of SEQ ID NO:258 
(including but not limited to hematopoietic cells) in a sample. Such a method comprises 
contacting the sample with the antibody under conditions sufficient for the antibody to bind to the 
protein of SEQ ID NO:258 and detecting the presence of bound antibody using methods known in 
the art, including those described herein. 

In one embodiment, the invention relates to methods and compositions using the protein of 
the invention or part thereof as a marker protein to selectively identify tissues, preferably liver and 
testis, or to distinguish between two or more possible sources of a tissue sample on the basis of the 
level of the protein of SEQ ID NO:258 in the sample. For example, the protein of SEQ ID NO:258 
or fragments thereof may be used to generate antibodies using any techniques known to those 
skilled in the art, including those described herein. Such tissue-specific antibodies may then be 
used to identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue 
that has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue 
cross-section using immunochemistry. In such methods a tissue sample is contacted with the 
antibody, which may be detectably labeled, under conditions which facilitate antibody binding. The 
level of antibody binding to the test sample is measured and compared to the level of binding to 
control cells from liver or testis or tissues other than liver or testis to determine whether the test 
sample is from liver or testis. Alternatively, the level of the protein of SEQ ID NO:258 in a test 
sample may be measured by determining the level of RNA encoding the protein of SEQ ID NO:258 
in the test sample. RNA levels may be measured using nucleic acid arrays or using techniques such 
as in situ hybridization, Northern blots, dot blots or other techniques familiar to those skilled in the 
art. If desired, an amplification reaction, such as a PCR reaction, may be performed on the nucleic 
acid sample prior to analysis. The level of RNA in the test sample is compared to RNA levels in 
control cells from liver or testis or tissues other than liver or testis to determine whether the test 
sample is from liver or testis. 
Protein of SEQ ID NO:279 (internal designation 160-58-3-0-H3-CS) 

The protein of SEQ ID NO:279 is encoded by the cDNA of SEQ ID NO:38. Accordingly, 
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:279 
described throughout the present application also pertain to the polypeptide encoded by a nucleic 
acid included in clone 160-58-3-0-H3-CS. In addition, it will be appreciated that all characteristics 
and uses of the nucleic acid of SEQ ID NO:38 described throughout the present application also 
pertain to the nucleic acid included in clone 160-58-3-0-H3-CS. 

The protein of SEQ ID NO:279 is encoded by a nucleic acid of 1330 nucleotides with an 
ORF from nt 198 to nt 998 yielding a 267 amino acid protein. The protein is a polymorphic variant 
of the sequence (SP:P01210) for proenkephalin A precursor (contains Met- and Leu-enkephalins). 
It has a signal peptide spanning 24 amino acids and 2 signature motifs for vertebrate endogenous 
opioid neuropeptides and endogenous opioid neuropeptide precursors. PSORT gives a predicted 
extracellular localization, including the cell wall (66.7%). The protein of SEQ ID NO:279 is 
primarily distributed in the fetal brain, although expression in other tissues has also been shown (see 
below). The polymorphic variation is found at amino acid position 75 (E->D, a conservative amino 
acid change). After signal peptide cleavage (amino acids 47 to 267; 220 amino acids), the protein still 
contains the polymorphic variation, which is now at amino acid position 29. This does not change 
any of the sequence of the different enkephalins that result after cleavage of this precursor protein. 
In addition, the polymorphism is 25 amino acids away from the first cleavage site on the amino 
terminal side. This is unlikely to change the secondary structure of the actual cleavage site. 
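
By way of non-limiting illustration, the coordinate arithmetic in the preceding paragraph can be reproduced as follows. All constant and function names are hypothetical. The sketch assumes that nucleotide and amino acid positions are 1-based and inclusive, that the stated ORF range contains whole codons only, and that the first enkephalin peptide begins at residue 100, as listed later in this section; whether the stated ORF range includes a stop codon is not specified in the text.

```python
ORF_START_NT = 198        # first nucleotide of the ORF, as stated
ORF_END_NT = 998          # last nucleotide of the ORF, as stated
VARIANT_POS = 75          # polymorphic position in the precursor (E->D)
MATURE_START = 47         # first residue after cleavage, as stated
FIRST_ENKEPHALIN = 100    # start of the first Met-enkephalin (peptide list)

# Inclusive ORF length in nucleotides and the number of whole codons.
orf_length_nt = ORF_END_NT - ORF_START_NT + 1
codons, remainder = divmod(orf_length_nt, 3)


def mature_position(precursor_pos, mature_start):
    """Renumber a precursor position relative to the first mature residue."""
    return precursor_pos - mature_start + 1


print(orf_length_nt, codons, remainder)
print(mature_position(VARIANT_POS, MATURE_START))
print(FIRST_ENKEPHALIN - VARIANT_POS)
```

Under these assumptions the ORF spans 801 nucleotides (267 codons), position 75 of the precursor becomes position 29 after cleavage, and the variant lies 25 residues upstream of the first enkephalin sequence, consistent with the figures given above.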

PCT publication WO9606863-A1, the disclosure of which is incorporated herein by 
reference in its entirety, discloses a protein having high homology with the protein of SEQ ID 
NO:279. Accordingly, the protein of SEQ ID NO:279 is believed to be an enkephalin. Met- 
and Leu-enkephalins compete with and mimic the effects of opiate drugs. These two pentapeptides 
with potent opiate agonist activity in bioassay systems were originally identified by Hughes et al. 
(Nature, 258, 577-580, 1975). The natural ligands for opiate receptors, which differ only in their 
COOH terminal amino acid, were named Met- and Leu-enkephalin to reflect their origin from the 
brain. Peptides containing these sequences are termed opiate or opioid peptides. Enkephalins are 
widely distributed throughout the central nervous system in enkephalinergic neuronal networks, and 
also exist in the peripheral nervous system, for example in autonomic ganglia. Data, largely 
circumstantial, suggest wide-ranging involvement of endogenous opioids for example in the 
modulation of pain perception, in mood and behaviour, learning and memory, responses to stress, 
diverse neuroendocrine functions, immune regulation and cardiovascular and respiratory function. 

Met-enkephalin enhances the immune reaction in patients with cancer or AIDS. It can bind 
opioid receptors present in peripheral inflamed tissues to mediate an analgesic effect. 

After exogenous administration of the different enkephalins, several immunologic functions 
are affected, including antibody production, NK cell activity against tumors and viral infections, 
macrophage and polymorphonuclear leukocyte functions, graft rejections, and mitogen-stimulated 
lymphocyte proliferation. The effects can be bi-directional, where low concentrations enhance, and 
high concentrations inhibit the same immune function. Thus, enkephalins are modulators of 
immune reactions. 

These opioid neuropeptides are released by post-translational proteolytic processing of 
precursor proteins. These multivalent precursor proteins (polyproteins) consist of a signal sequence 
followed by a conserved region of about 50 residues, a variable length region and the sequence of 
the various neuropeptides. The preproenkephalin A (gene PENK) is processed to produce the 
following peptides, which include Met-enkephalin (6 copies, 2 of which are extended) and 
Leu-enkephalin: 

Signal peptide      1-24
Peptide             100-104    Met-enkephalin 1
Peptide             107-111    Met-enkephalin 2
Peptide             136-140    Met-enkephalin 3
Peptide             186-193    Met-enkephalin-arg-gly-leu
Peptide             210-214    Met-enkephalin 4
Peptide             230-234    Leu-enkephalin
Peptide             261-267    Met-enkephalin-arg-phe
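
As a quick consistency check on the coordinates listed above, the following Python sketch computes the length implied by each residue range and compares it with the length expected for the named peptide. It assumes that positions are 1-based and inclusive (the text does not say so explicitly) and that Met- and Leu-enkephalin are pentapeptides, with each named C-terminal extension adding one residue. All function and variable names are illustrative and do not come from the source.

```python
# Residue ranges copied from the table above. Positions are ASSUMED to be
# 1-based and inclusive; the source text does not state this explicitly.
PENK_PEPTIDES = [
    ("Met-enkephalin 1", 100, 104),
    ("Met-enkephalin 2", 107, 111),
    ("Met-enkephalin 3", 136, 140),
    ("Met-enkephalin-arg-gly-leu", 186, 193),
    ("Met-enkephalin 4", 210, 214),
    ("Leu-enkephalin", 230, 234),
    ("Met-enkephalin-arg-phe", 261, 267),
]

ENKEPHALIN_LENGTH = 5  # Tyr-Gly-Gly-Phe-Met or Tyr-Gly-Gly-Phe-Leu


def span_length(start: int, end: int) -> int:
    """Number of residues in a 1-based, inclusive range."""
    return end - start + 1


def expected_length(name: str) -> int:
    """Pentapeptide core plus one residue per named C-terminal extension."""
    extensions = name.lower().split("-")[2:]  # e.g. ["arg", "gly", "leu"]
    return ENKEPHALIN_LENGTH + len(extensions)


def find_mismatches(peptides):
    """Return the names whose coordinate span disagrees with the expected length."""
    return [
        name
        for name, start, end in peptides
        if span_length(start, end) != expected_length(name)
    ]


def extract(sequence: str, start: int, end: int) -> str:
    """Slice a peptide out of a precursor sequence using 1-based inclusive positions."""
    return sequence[start - 1:end]


if __name__ == "__main__":
    print("Mismatched entries:", find_mismatches(PENK_PEPTIDES))
    met_copies = sum(name.startswith("Met-") for name, _, _ in PENK_PEPTIDES)
    print("Met-enkephalin-containing peptides:", met_copies)
```

The `extract` helper is included only to show how the same coordinates would be applied to a precursor sequence once one is at hand; no sequence is supplied here.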

The conserved region in the N-termini of these precursors contains six cysteines that are 
probably involved in disulfide bonds. This region could also be important for the processing of the
neuropeptides. 

The precursor protein does have the potential to be differentially cleaved into multiple 
extended enkephalin and non-enkephalin-containing peptides, the functions of which are largely 
unknown; however, in some cases it has been shown that extended enkephalin-containing peptides 
have enhanced opiate activity. Another peptide, enkelytin, is produced that exhibits anti-bacterial
activity (see below). 

There is a growing body of evidence that proenkephalin exists largely independently of free 
enkephalin peptides in a number of tissues and cell types including astrocytes (Melner et al, EMBO 
J, 9, 791-796, 1990; Spruce et al, EMBO J 9, 1787-1795, 1990, the disclosures of which are
incorporated herein by reference in their entireties), and is released from these cells in an
unprocessed form (Batter et al, Brain Res. 563, 28-32, 1991, the disclosure of which is incorporated 
herein by reference in its entirety). There is evidence in some cases that processing enzymes are co- 
released along with the unprocessed precursor which suggests that extracellular cleavage may occur 
(Vilijn et al, J. Neurochem. 53, 1487-1493, 1989). Even if biological activity is signalled through
binding of the small peptide products to cell surface receptors, the regulation of this activity may be
mediated through the precursor, and it is also possible that the unprocessed precursor has an 
additional intracellular role of its own.

This protein was originally described to be present in various brain regions, most notably in
the striatum as well as in neuroendocrine tissues, the pituitary and adrenal gland. It is also
expressed in a variety of immune cells, including ConA-stimulated CD4 T lymphocytes, CD4
thymocytes, B lymphocytes, as well as T cell lines, macrophages and mast cells. Expression has
been reported in the reproductive system, heart and many developing tissues during gestation and
early postnatal period. Because of this, it has been postulated that these peptides play a role in cell
or tissue growth and differentiation. For example, endogenous enkephalins induced in thymocytes 
modulate their own expression and function to inhibit the proliferation of activated thymocytes. 
Enkephalin peptides are abundant in adrenal medulla and can be released by 
neurotransmitters specific for that tissue. Enkephalins have also been found to be abundant in
human phaeochromocytoma, a tumour derived from the adrenal medulla. The RNA from this 
tumour contains a high level of enkephalin mRNA sequences as demonstrated by cell-free 
translation studies. 

Enkephalins function as ligands of opiate receptors, which are classified as delta, kappa and mu. A study by
Lord et al (Nature, 267, 495-499, 1977) compared the activity of morphine and enkephalins in
bioassay systems, and found that enkephalins bound predominantly to delta receptors. Subsequent 
studies have revealed homology of these receptors to other receptor families, including the 
immunoglobulin superfamily member OBCAM (Schofield et al, EMBO J 8, 489-495, 1989, the 
disclosure of which is incorporated herein by reference in its entirety) and somatostatin receptors
(PCT publication WO96/06863, the disclosure of which is incorporated herein by reference in its
entirety). This would explain the reported opioid binding properties of the former. Because of the 
latter's homology to opiate receptors, it would also be expected to bind opioid receptor ligands. 
The recognition of opioid peptides by other non-opiate related receptors implies that these peptides 
may exert other as yet unknown functions. 

Enkephalins are also involved in apoptosis. Apoptosis is the morphologically distinct
process of controlled cell death which balances the process of cell production by mitosis. A 
molecular connection between control of cell production and cell elimination has now been 
established, including the roles of c-myc and p53 in the pathways mediating apoptotic cell death. It 
has been proposed that all mammalian cells may be programmed to die by default in the absence of
continuous signalling from neighboring cells. However, the acquisition of a survival advantage
which prevents a single cell from activating its suicide program in response to levels of genetic 
damage associated with common environmental insults could theoretically be an initiating event in 
oncogenesis since it would favor the persistence of potentially tumorigenic mutations. 
Alternatively, inappropriate activation of survival pathways might lead to overriding the intrinsic
death program and promote tumorigenesis at early and late stages. A particularly potent oncogenic
pathway would be one which both promoted and tolerated genetic damage and helped a cell 
overcome its need for extracellular survival signals. Approximately 50% of human tumors possess
normal p53 function. Thus, additional pathways or molecules which inappropriately repress
apoptosis in human tumours remain to be identified. Opioid-like molecules could be involved in 
such a pathway. 

There are published reports that pathways which include opioid-like molecules participate
in regulating the equilibrium between cell death and survival. For example, morphine inhibits cell
survival in the developing cerebellum (Hauser et al, Exp. Neurol, 130, 95-105, 1994, the disclosure 
of which is incorporated herein by reference in its entirety) and induces apoptosis in thymocytes 
(Fuchs and Pruett, J. Pharmacol. Exp. Ther. 266, 417-423, 1993, the disclosure of which is 
incorporated herein by reference in its entirety). 

In a series of experiments (PCT publication WO 96/06863), it has been found that
proenkephalin and/or its proteolytic products act as extracellular and/or cell surface membrane 
bound factors which modulate cell survival in transformed cells a) upon deprivation of exogenous 
survival factors, and b) following genotoxic injury and/or stress when exogenous survival factors 
are non-limiting. The receptor(s) to which these factor(s) bind, which are most likely to exist on the
cell surface, are related, or possibly identical, to one or more members of the opioid receptor family.
Opioid-like receptor types or subtypes can mediate survival or death; receptor(s)
which mediate death appear to be coupled to those which mediate survival. Natural ligands for these
receptors are likely to be products of the opioid precursor genes, although natural ligands could 
include cytokines which mimic their effect. Tumour cells are more sensitive to antagonism of
opioid-like receptor-mediated survival, and to stimulation of opioid-like receptor-mediated death,
than non-transformed cells. The induction of cell cycle arrest enhances the sensitivity of tumour
cells to these manipulations. (Enhanced sensitivity of tumour cells to these manipulations is induced
by their synchronisation within the cell cycle.)

Cytoplasmic proenkephalin and/or its proteolytic products act as general repressors of
apoptosis. Agents which, if coupled to appropriate internalisation agents, would antagonise
cytoplasmic proenkephalin would therefore be of use in the induction of apoptosis in 
non-transformed as well as transformed cells, particularly in combination with sublethal doses of 
known apoptosis-inducing agents. 

The repression of apoptosis mediated through cytoplasmic proenkephalin is activated at
high cell density predominantly by nondiffusible factors. Inhibition of proenkephalin or its products
as described above would therefore be potentiated if agents were used in combination, for example,
with neutralising antibodies to integrins (such as the antibody 23C6; Bates et al., J. Cell Biol. 125,
403-415, 1994) to reduce exogenous survival signaling and simulate low density.

Proenkephalin targeted to the cell nucleus induces apoptotic death, which is inhibited by the
overexpression of large T antigen and is at least partly mediated through p53. Tumors which retain
wild-type p53 function are therefore a particular target for apoptosis induction by agents which
increase the levels of proenkephalin, or its derivatives, within the nucleus or which mimic the
function of nuclear proenkephalin or its derivatives.

Accordingly, the protein of SEQ ID NO:279, fragments thereof, or nucleic acids encoding 
the protein of SEQ ID NO:279 may be used to modulate a biochemical pathway in which products
of opioid peptide precursor genes participate. In some embodiments, antibodies or other agents
which reduce the level or activity of the protein of SEQ ID NO:279 or fragments thereof may be 
used to induce apoptosis in cells. The agents preferably neutralize the protein of SEQ ID NO:279 
or its proteolytic derivatives, increase the level of, activate or mimic nuclear proenkephalin, or act 
as an antagonist to receptors related or identical to the delta and kappa opioid receptors. In some
embodiments, the agent may be a neutralizing monoclonal antibody against the protein of SEQ ID
NO:279 or a fragment thereof. The agent may also be a fragment or allelic form of one of these 
antibodies. A cytoplasmic anchor, or a nuclear localization signal may also be included in the 
agent. In some embodiments, the agent is able to modulate a biochemical pathway in a cell in 
which products of opioid peptide precursor genes participate in order to induce apoptosis. The
agents can be used for the treatment of cancer or for inducing apoptosis in lens cells following a
cataract operation. In some embodiments, the agents promote apoptosis of proliferating cells with 
less, or no, effect on normal mature cell types. The agents may be administered in combination 
with a genotoxic or cell cycle arrest agent. Alternatively, the agent may be complexed with a 
chemotherapeutic, irradiation or cell cycle arrest (synchronization) agent.

Accordingly, the invention provides a means of inducing apoptosis in cells which comprises
modifying a biological pathway of a cell in which a product of an opioid precursor gene participates 
in such a way that apoptosis is induced. Modification of the pathway is suitably effected by 
administration of an appropriate agent. In particular, the present invention provides an agent for use
in inducing apoptosis in cells, said agent comprising an agent able to neutralise proenkephalin or its
proteolytic derivatives; an agent which increases the level of nuclear proenkephalin and/or its
derivatives, or which activates or mimics them; an agent which acts as an antagonist at receptor(s)
related or identical to the delta opioid receptor, or an agent which acts as an agonist at receptor(s) 
related or identical to the kappa opioid receptor. 

A subset of such agents are agents able to neutralise proenkephalin or its proteolytic
derivatives, or an agent which acts as an antagonist at receptor(s) related or identical to the delta
opioid receptor, or an agent which acts as an agonist at receptor(s) related or identical to the kappa 
opioid receptor. 

In some embodiments, the agent may be administered to the cell surface whereupon the 
survival effects of extracellular and/or cell surface membrane bound proenkephalin or its proteolytic 
derivatives are neutralised, causing the cell to become apoptotic. Alternatively, an agent able to
neutralise proenkephalin or its proteolytic derivatives may be coupled to an internalisation peptide
and a cytoplasmic anchor. Such an assembly will remain in the cytoplasm of the cell, antagonising
cytoplasmic proenkephalin and/or its proteolytic products and thus neutralising the apoptosis
repressor effect of these molecules. 

Enkephalins also have anti-bacterial activity. During processing of proenkephalin-A,
the maturation in the adrenal medullary chromaffin cell starts with the removal of the
carboxy-terminal end (proenkephalin-A-derived peptide or PEAP209-239) (Y. Goumon, K. Lugardon, B.
Kieffer et al. J. Biol. Chem. 273:29847-29856, 1998, the disclosure of which is incorporated
herein by reference in its entirety). The peptide enkelytin was identified as corresponding to
bisphosphorylated PEAP209-237, and possesses antibacterial activity against Staphylococcus aureus
and other gram-positive bacteria such as Micrococcus luteus and Bacillus megaterium (0.2-0.4 µM
range). It does not affect the growth of gram-negative bacteria (E. coli strains D22, D31, 663 and
T13773), nor does it have any hemolytic activity. The activity of this peptide is specific:
shorter versions of the peptide (209-220, 224-237, 230-237, 233-237) or non-phosphorylated
PEAP209-239 exhibited little to no bacterial growth inhibiting activity.

Bovine periarthritis abscess fluid contains different forms of PEAP (72-237/239;
80-237/239) as identified by immunoreactivity and confirmed by sequence analysis. These peptides
have activity against M. luteus, but are less active than enkelytin (5 versus 0.2 µM). These PEAPs
constitute a pool of precursors which have to be processed, during infection, to provide active
enkelytin. Presence of a PEAP at a molecular mass corresponding to that of PEAP209-237 was
detected as well. PEAPs (PEAP202-238 and PEAP206-237) have also been detected in wound fluids,
including bovine post-caesarean abscess in the subcutaneous lining, and an abscess induced by
subcutaneous injection of complete Freund's adjuvant. Therefore, these peptides are present in
wound fluids along with other known antibacterial peptides (defensins, bactenecins). The
concentrations were in a range similar to that found to be active in vitro (0.5-1 µM). The PEAPs
have also been detected in secretions from human polymorphonuclear neutrophils.

The PEAP209-230 and enkelytin are secreted from cultured chromaffin cells following
stimulation. This suggests that these two peptides are co-released with catecholamines in stress 
situations and may therefore play an important role in defense mechanisms. 

Co-release of met-enkephalin and enkelytin would represent a unified neuroimmune
protective response to stress situations that may be accompanied by infectious diseases. This
would provide a highly beneficial survival strategy at the very beginning of proinflammatory
processes. This protein would therefore play an important role in host defense against microbial 
infections, especially those involving gram positive bacteria. Due to their nonspecific activity on 
membranes, the antibacterial peptides possess cytotoxic activities and may not only play a role in 
antimicrobial defense, but also in inflammatory processes, possibly in wound repair. 

The protein of SEQ ID NO:279, peptides derived by cleavage thereof or fragments thereof
could be used as antibacterial agents in creams/ointments/solutions, presoaked bandages, or
dermal-type patches for external applications. Alternatively, the protein of SEQ ID NO:279, peptides
derived by cleavage thereof, or fragments thereof may be used in injections (intravenously,
subcutaneously or intra-peritoneally). This is useful for wound repair, burn healing, and post-operative
recovery management.

Alternatively, the protein of SEQ ID NO:279, peptides derived by cleavage thereof, or
fragments thereof, may be incorporated into disinfectant solutions used for cleaning surfaces such
as in the house (kitchen, bathroom) or in the office (desktops, phones, computer keyboards and
mouse). Other applications are as additives in mouthwash or handi-popup wipes. 

Altered levels of enkephalins may produce psychological disease. Konig et al (Nature, 383, 
535-538, 1996, the disclosure of which is incorporated herein by reference in its entirety) used a
genetic approach to study the role of the mammalian opioid system. They disrupted the
pre-proenkephalin gene using homologous recombination in embryonic stem cells to generate
enkephalin-deficient mice. Mutant enk -/- animals are healthy, fertile, and care for their offspring, 
but display significant behavioral abnormalities. Mice with the enk -/- genotype are more anxious 
and males display increased offensive aggressiveness. Mutant animals show marked differences
from controls in supraspinal, but not in spinal, responses to painful stimuli. These enk -/- mice do
however exhibit normal stress-induced analgesia. Therefore, enkephalins modulate responses to 
painful stimuli. Thus, genetic factors may contribute significantly to the experience of pain. This 
study clearly indicates the importance of enkephalins in pain perception, anxiety and 
aggressiveness. 

Interestingly, the PENK gene is localized on 8q23-q24, the same locus on which are found
genes related to epilepsy and spastic paraplegia, disorders related to brain dysfunction. 

Accordingly, the protein of SEQ ID NO:279 or fragments thereof may be used for the 
treatment of psychological disorders, especially those involving distortion in the perception of pain, 
aggressiveness, or anxiety. This would include drug addiction, different types of phobias, panic
attacks, schizophrenia, bipolar disorder, anorexia nervosa, chronic pain disorders, post-traumatic events,
and post-operative pain management.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:279, 
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200 
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. For example, the condition may be cancer, a condition
resulting from increased or decreased cellular proliferation, bacterial infection, conditions resulting 
from abnormal immune responses, psychological disease or any of the conditions listed above. In 
such embodiments, the protein of SEQ ID NO:279, or a fragment thereof, is administered to an 
individual in whom it is desired to increase or decrease any of the activities of the protein of SEQ
ID NO:279. The protein of SEQ ID NO:279 or fragment thereof may be administered directly to
the individual or, alternatively, a nucleic acid encoding the protein of SEQ ID NO:279 or a 
fragment thereof may be administered to the individual. Alternatively, an agent which increases the
activity of the protein of SEQ ID NO:279 may be administered to the individual. Such agents may
be identified by contacting the protein of SEQ ID NO:279 or a cell or preparation containing the
protein of SEQ ID NO:279 with a test agent and assaying whether the test agent increases the
activity of the protein. For example, the test agent may be a chemical compound or a polypeptide
or peptide.

Alternatively, the activity of the protein of SEQ ID NO:279 may be decreased by 
administering an agent which interferes with such activity to an individual. Agents which interfere 
with the activity of the protein of SEQ ID NO:279 may be identified by contacting the protein of 
SEQ ID NO:279 or a cell or preparation containing the protein of SEQ ID NO:279 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an 
antisense nucleic acid or a triple helix-forming nucleic acid. 

In one embodiment, the invention relates to methods and compositions using the protein of 
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, fetal brain, or to distinguish between two or more possible sources of a sample on the
basis of the level of the protein of SEQ ID NO:279 in the sample. For example, the protein of SEQ 
ID NO:279 or fragments thereof may be used to generate antibodies using any techniques known to 
those skilled in the art, including those described herein. Such antibodies may then be used to
identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue that
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a sample is contacted with the antibody, which
may be detectably labeled, under conditions which facilitate antibody binding. The level of 
antibody binding to the test sample is measured and compared to the level of binding to control cells 
from fetal brain or tissues other than fetal brain to determine whether the test sample is from fetal
brain. Alternatively, the level of the protein of SEQ ID NO:279 in a test sample may be measured
by determining the level of RNA encoding the protein of SEQ ID NO:279 in the test sample. RNA 
levels may be measured using nucleic acid arrays or using techniques such as in situ hybridization, 
Northern blots, dot blots or other techniques familiar to those skilled in the art. If desired, an 
amplification reaction, such as a PCR reaction, may be performed on the nucleic acid sample prior
to analysis. The level of RNA in the test sample is compared to RNA levels in control cells from
fetal brain or tissues other than fetal brain to determine whether the test sample is from fetal brain. 

In another embodiment, antibodies to the protein of the invention or part thereof may be 
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:279, 
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:279 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:279 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody. 

In another embodiment of the present invention, the protein of SEQ ID NO:279 or a 
fragment thereof may be used to diagnose disorders associated with altered expression of
the protein of SEQ ID NO:279. In such techniques, the level of the protein of SEQ ID NO:279 in
an ill individual is measured using techniques such as those described herein. The level of the 
protein of SEQ ID NO:279 in the ill individual is compared to the level in normal individuals to 
determine whether the individual has a level of the protein of SEQ ID NO:279 which is associated 
with disease. 

Protein of SEQ ID NO: 293 (internal designation 181-16-1-0-G7-CS)

The protein of SEQ ID NO: 293 has a high degree of homology with HSPC163 (Genbank 
accession number AF161512), the protein encoded by gene no: 93 (PCT/US99/17130) and the 
human cornichon protein TGAM77. SEQ ID NO: 293 is overexpressed in cancerous prostate, fetal 
brain and fetal kidney. 

The gene HSPC163 is one of three hundred cDNAs obtained from a CD34+ hematopoietic
stem/progenitor cell (HSPC) library (obtained from umbilical cord blood and adult bone marrow).
HSPC163 has also been identified in five hematopoietic cell lines: NB4 (granulocytic), HL60
(granulocytic), U937 (monocytic), K562 (erythro-megakaryocytic), and Jurkat (T lymphocytic). 
These cell lines represent the distinct lineages of hematopoietic cells. 

The polypeptide of gene no: 93 has been determined to have two transmembrane domains
and a short cytoplasmic tail. Based upon these characteristics, it is believed that the protein product
of gene no: 93 shares structural similarity to type IIIa membrane proteins. This gene is expressed
primarily in activated T-cells and to a lesser extent in endometrial tumor, T cell helper II cells,
microvascular endothelial cells, Raji cells treated with cycloheximide and umbilical vein
endothelial cells. The expression pattern of gene no: 93 indicates a role in regulating the
proliferation, survival, differentiation, and/or activation of hematopoietic cell lineages, including
blood stem cells. The gene product appears to be involved in the regulation of cytokine production,
antigen presentation, and other immune processes, suggesting a usefulness in boosting the immune
system. The translation product of this gene has high homology to the human TGAM77 and mouse
cornichon proteins.

TGAM77 was identified as a gene involved in early phase of T-cell activation in response 
to alloantigens. Twenty four hours after T-cell allostimulation, RNA expression of TGAM77 is 
significantly increased. TGAM77 has been designated as a T-cell growth associated molecule. 
TGAM77 is a human homolog of cornichon (cni) protein of the fruit fly Drosophila. 

Cornichon was demonstrated to be involved in carefully orchestrated signaling events
during Drosophila oogenesis establishing an asymmetric pattern in the oocyte as a prerequisite for
correct embryogenesis. Cornichon signaling functions in concert with two other proteins. The
function of all three genes in an EGF-like signaling pathway appears to direct the formation of a
correctly polarized microtubule cytoskeleton, which is thought to be the basis for the correct spatial
localization of other signaling molecules essential for oocyte polarization, asymmetric movement
of the nucleus, and embryo differentiation.

The subject invention provides the amino acid sequence of SEQ ID NO: 293 and
polynucleotide sequences encoding the amino acid sequence of SEQ ID NO: 293. In one
embodiment, the polypeptides of SEQ ID NO: 293 are interchanged with the corresponding
polypeptides encoded by the human cDNA of clone 181-16-1-0-G7-CS. Also included in the
invention are biologically active fragments of SEQ ID NO: 293 and polynucleotide sequences
encoding these biologically active fragments. "Biologically active fragments" are defined as those
peptide or polypeptide fragments of SEQ ID NO: 293 which have at least one of the biological
functions of the full length protein (e.g., the ability to stimulate T-cell proliferation).

The invention also provides variants of SEQ ID NO: 293. These variants have at least
about 80%, more preferably at least about 90%, and most preferably at least about 95% amino acid
sequence identity to the amino acid sequence of SEQ ID NO: 293. Variants according to the
subject invention also have at least one functional or structural characteristic of SEQ ID NO: 293,
such as the biological functions described above. The invention also provides biologically active
fragments of the variant proteins. Unless otherwise indicated, the methods disclosed herein can be
practiced utilizing SEQ ID NO: 293 or variants thereof. Likewise, the methods of the subject
invention can be practiced using biologically active fragments of SEQ ID NO: 293, or variants of said
biologically active fragments.
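
The identity thresholds stated above can be made concrete with a small sketch. The Python below computes percent amino acid identity between two sequences that are assumed to be already aligned to equal length, with gaps written as "-". Identical positions are counted over all alignment columns; other conventions for the denominator exist, and the text does not specify which one is intended. The example sequences are hypothetical and are not SEQ ID NO: 293, and the function names are illustrative only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Both inputs must have the same length; '-' marks a gap. Columns where
    both sequences have a gap are ignored. This is one common convention,
    chosen here as an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the pair reaches the given percent identity (80, 90 or 95 in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # hypothetical 10-residue reference
    variant = "MKTAYLAKQR"    # hypothetical variant with one substitution
    print(percent_identity(reference, variant))
    for cutoff in (80.0, 90.0, 95.0):
        print(cutoff, meets_threshold(reference, variant, cutoff))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch covers only the final counting step.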

Because of the redundancy of the genetic code, a variety of different DNA sequences can 
encode SEQ ID NO: 293. It is well within the skill of a person trained in the art to create these
alternative DNA sequences which encode proteins having the same, or essentially the same, amino
acid sequence. These variant DNA sequences are, thus, within the scope of the subject invention. 
As used herein, reference to "essentially the same" sequence refers to sequences that have amino 
acid substitutions, deletions, additions, or insertions that do not materially affect biological activity. 
Fragments retaining one or more characteristic biological activities of SEQ ID NO: 293 are also
included in this definition.

"Recombinant nucleotide variants" are alternate polynucleotides which encode a particular 
protein. They can be synthesized, for example, by making use of the "redundancy" in the genetic 
code. Various codon substitutions, such as the silent changes which produce specific restriction 
sites or codon usage-specific mutations, can be introduced to optimize cloning into a plasmid or
viral vector or expression in a particular prokaryotic or eukaryotic host system, respectively.
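
To illustrate the redundancy of the genetic code referred to above, the following Python sketch counts and enumerates the DNA sequences that encode a given peptide under the standard genetic code. Met-enkephalin (Tyr-Gly-Gly-Phe-Met, one-letter code YGGFM) is used here only as a short illustrative peptide; the same approach would apply to any amino acid sequence, although the number of encodings grows very quickly with length. The function names are illustrative and are not taken from the source.

```python
from itertools import product
from math import prod

# Standard genetic code, written in the conventional compact form:
# codons are ordered with bases T, C, A, G and the third position varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse lookup: amino acid -> list of synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences encoding the peptide (no stop codon)."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)


def enumerate_encodings(peptide: str):
    """Yield every DNA sequence encoding the peptide; only practical for short peptides."""
    for codons in product(*(SYNONYMS[aa] for aa in peptide)):
        yield "".join(codons)


def translate(dna: str) -> str:
    """Translate a DNA string whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    peptide = "YGGFM"  # Met-enkephalin, used only as a short example
    print(count_encodings(peptide))
    print(next(enumerate_encodings(peptide)))
```

Selecting among the synonymous sequences, for example to match the codon usage of a particular host or to introduce a restriction site, is a separate step that this sketch does not attempt.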

The SEQ ID NO: 293 protein, and variants thereof, can be used to produce antibodies according
to methods well known in the art. The antibodies can be monoclonal or polyclonal. Antibodies can
also be synthesized against fragments of SEQ ID NO: 293 as well as variants of SEQ ID NO: 293
according to known methods. The subject invention also provides antibodies which specifically 
bind to biologically active fragments of SEQ ID NO: 293 or biologically active fragments of SEQ 
ID NO: 293 variants. 

The subject invention also provides for immunoassays which are used to screen for,
monitor, or diagnose prostate cancer. Methods of screening for, diagnosing, identifying, or
monitoring the course of prostate cancer are well known to those skilled in the art. In this aspect of
the invention, immunoassays are provided which contact a biological sample (e.g., blood, serum,
tissue, or biopsied tissue sample) with antibodies which specifically bind to SEQ ID NO: 293,
immunogenic fragments of SEQ ID NO: 293, or biologically active fragments of SEQ ID NO: 293.
Immunocomplexes formed in the contacting step are then detected using an appropriately labeled
detection reagent. The levels of SEQ ID NO: 293 expressed in the tested biological samples are 
compared to control/normal levels typically observed in the population. 

Alternatively, methods which screen for, monitor, or diagnose prostate cancer may be
practiced with SEQ ID NO: 293, or fragments of SEQ ID NO: 293, as well as nucleic acids
encoding SEQ ID NO: 293, or fragments of SEQ ID NO: 293. In one embodiment, the
polypeptide may be used as a standard/control in the immunoassays described above. In another
embodiment, the nucleic acids encoding SEQ ID NO: 293, or fragments of SEQ ID NO: 293, are
used in hybridization assays, well known to the skilled artisan, to identify biological samples (e.g.,
blood, serum, tissue, or biopsied tissue sample) which contain SEQ ID NO: 293. The levels of
SEQ ID NO: 293 expressed in the tested biological samples are compared to control/normal levels 
typically observed in the population. 

In another embodiment, SEQ ID NO: 293, and polynucleotide sequences encoding the
amino acid sequence of SEQ ID NO: 293, can be used to identify or diagnose immune disorders
involving activated T-cells using standard hybridization assays.

Another aspect of the invention provides methods of immunostimulating a mammal. In this 
aspect of the invention, SEQ ID NO: 293, and/or polynucleotide sequences encoding the amino
acid sequence of SEQ ID NO: 293, are introduced into T-cells according to well known methods.
The T-cells are then activated by stimulation with antigen to induce the immune system of the mammal.

In another embodiment, autologous T-cells are obtained from an individual. SEQ ID NO:
293, biologically active fragments thereof, and/or polynucleotide sequences encoding the amino
acid sequence, or biologically active fragments, of SEQ ID NO: 293, are introduced into these
autologous T-cells according to well known methods. The T-cells are expanded and reintroduced
into the individual from which the T-cells were obtained. See, for example, U.S. Patent Nos.
5,192,537 and 5,766,920, hereby incorporated by reference in their entirety.

In another embodiment of the subject invention, polypeptides of SEQ ID NO:293, and
polynucleotides encoding SEQ ID NO:293, can be used to expand stem cells and committed
progenitors of various blood lineages, and in the differentiation and/or proliferation of various
cell types. In this aspect of the invention, the polypeptides of SEQ ID NO:293 or the
polynucleotides encoding SEQ ID NO:293 are introduced into the cells and the cells are cultured.
These methods may be practiced according to methods well known to the routineer.

Protein of SEQ ID NO:316 (internal designation 188-45-1-0-D9-CS)

The protein of SEQ ID NO:316 is encoded by the cDNA of SEQ ID NO:75. Accordingly,
it will be appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:316
described throughout the present application also pertain to the polypeptide encoded by a nucleic
acid included in clone 188-45-1-0-D9-CS. In addition, it will be appreciated that all characteristics
and uses of the nucleic acid of SEQ ID NO:75 described throughout the present application also
pertain to the nucleic acid included in clone 188-45-1-0-D9-CS.

The protein of SEQ ID NO:316 is expressed in brain and contains three membrane-spanning
segments located between amino acid positions 6 and 26, 73 and 93, and 139 and 159, and a
signal peptide comprising the sequence FAAFCYMLSLVLC/AA. Accordingly, one embodiment
of the present invention is a polypeptide comprising one or more of the membrane-spanning
segments, and/or the signal peptide.

The protein of SEQ ID NO:316 is a member of the cornichon protein family. It has 48%
identity with the Drosophila melanogaster cornichon protein as well as 67% identity with the
Human Cornichon homolog TGAM77 (Genbank accession No. AF104398, the disclosure of which
is incorporated herein by reference in its entirety), 67% identity with hCornichon, a bone marrow
secreted protein (PCT publication WO/9933979, the disclosure of which is incorporated herein by
reference in its entirety), 67% identity with a human secreted protein encoded by gene 24 (PCT
publication WO/9910363, the disclosure of which is incorporated herein by reference in its entirety)
and 67% identity with the protein product of the mouse cnih gene. However, this protein has higher
homology, 81% identity, to the mouse cornichon-like protein (Genbank accession No. AB006191,
the disclosure of which is incorporated herein by reference in its entirety), which is the product of
the mouse cnil gene. Finally, the protein of SEQ ID NO:316 has a high level of identity with the
human secreted protein encoded by gene 95 (GSP:Y76218, PCT publication WO/9958660, the
disclosure of which is incorporated herein by reference in its entirety) and is likely a polymorphic
variant of gene 95. The high degree of sequence conservation between the members of this family
indicates that they are under strong selective pressure and are likely involved in important cellular
functions.
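
By way of illustration only, the percent identity figures recited above may be understood with
the following minimal sketch. It assumes that the two sequences have already been aligned by
some alignment program and that identity is counted over alignment columns in which neither
sequence carries a gap; the specification does not state which convention was used, and the
function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: gaps are written as '-' and gapped columns are excluded
    from the denominator. Other conventions (e.g. dividing by the full
    alignment length) give different percentages.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # columns where those residues are identical
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from any sequence of the invention.
    print(percent_identity("MKV-LA", "MRVQLA"))
```

In this toy alignment five columns are ungapped and four of them match, so the sketch reports
80% identity.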

The Drosophila cornichon (cni) gene product is involved in signaling processes necessary
for both anterior-posterior and dorsal-ventral pattern formation during Drosophila embryogenesis
(Cell, 1995, 81:967-978). Mutations in cornichon prevent the formation of a correctly polarized
microtubule cytoskeleton in the oocyte. Cni signaling functions in concert with two other proteins.
Gurken, which is a protein secreted from the oocyte containing a single epidermal growth factor
(EGF) motif most similar in structure to vertebrate TGFα, is considered to be the ligand of the
Drosophila epidermal growth factor receptor (DER) homolog torpedo, which is expressed by the
follicular epithelium. The function of all three genes in an EGF-like signaling pathway appears to
direct the formation of a correctly polarized microtubule cytoskeleton, which is thought to be the
basis for the correct spatial localization of other signaling molecules essential for oocyte
polarization, asymmetric movement of the nucleus, and embryo differentiation. TGAM77, one of
the human homologs of cornichon, is differentially expressed in alloactivated T-cells (Biochim. Biophys.
Acta 1999, 1449:203-210, the disclosure of which is incorporated herein by reference in its
entirety). Since there is a well-known involvement of the microtubule cytoskeleton in spatial
polarization of signaling events in T-cell activation, it is thought that TGAM77 may function in a
protein-tyrosine kinase pathway required for the vectorial localization of signaling molecules in
T-cell activation.

The protein of SEQ ID NO:316 is found in brain tissue, and gene 95 (GSP:Y76218, PCT
publication WO/9958660, the disclosure of which is incorporated herein by reference) is expressed
in infant brain tissue, endometrial tumor tissue and frontal cortex tissue. ESTs matching this gene
are also found in lung tissue, germ cell tumors and skin melanomas. This is similar to the
expression pattern of the murine cnil gene, which is found in 6.5-day whole embryos, 11.5-day limb
bud, 13.5-day whole embryo, adult lung and brain (Dev. Genes Evol., 1999, 209:120-125, the
disclosure of which is incorporated herein by reference in its entirety).

Polynucleotides encoding the protein of SEQ ID NO:316 or fragments thereof and
polypeptides comprising the protein of SEQ ID NO:316 or fragments thereof are useful as reagents
for differential identification of the tissue(s) or cell type(s) present in a biological sample and for
diagnosis of diseases and conditions which include, but are not limited to, endometrial tumor, and
neural and developmental diseases and/or disorders. Similarly, the protein of SEQ ID NO:316 or
fragments thereof and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the neural and reproductive organs, expression
of this gene at significantly higher or lower levels may be routinely detected in certain tissues or cell
types (e.g., neural, reproductive, cancerous and wounded tissues) or bodily fluids (e.g., lymph,
serum, plasma, urine, amniotic fluid, synovial fluid and spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution in infant brain tissue and adult brain tissue, as well as the homology
to cornichon proteins, indicates that polynucleotides encoding the protein of SEQ ID NO:316 or
fragments thereof and polypeptides comprising the protein of SEQ ID NO:316 or fragments thereof
are useful for detecting and/or treating neural and developmental disorders. The tissue distribution
indicates that these polynucleotides and polypeptides are useful for the detection/treatment of
neurodegenerative disease states and behavioural disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, Tourette Syndrome, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder, panic disorder, learning disabilities, ALS, psychoses, autism, and
altered behaviors, including disorders in feeding, sleep patterns, balance, and perception. In
addition, the gene or gene product may also play a role in treatment and/or detection of
developmental disorders associated with the developing embryo, or sexually-linked disorders.

Elevated expression of the protein of SEQ ID NO:316 within the brain suggests that it may
be involved in neuronal survival, synapse formation, conductance, neural differentiation, etc. Such
involvement may impact many processes, such as learning and cognition. Alternatively, the tissue
distribution in endometrial tumor tissue, germ cell tumors and skin melanomas indicates that the
translation product of this gene is useful for the detection and/or treatment of endometrial tumors
and/or reproductive disorders, as well as tumors of other tissues where expression of this gene has
been observed. Furthermore, the protein of SEQ ID NO:316 or fragments thereof may also be used
to determine biological activity, to raise antibodies, as a tissue marker, to isolate cognate ligands or
receptors, to identify agents that modulate their interactions, in addition to its use as a nutritional
supplement. The protein of SEQ ID NO:316 or fragments thereof, as well as antibodies directed
against the protein, may be used as tumor markers and/or immunotherapy targets for the above listed
tissues.

The gene encoding the protein of SEQ ID NO:316 is thought to reside on chromosome 11.
Accordingly, polynucleotides encoding the protein of SEQ ID NO:316 or fragments thereof are
useful as a marker in linkage analysis for chromosome 11.

Accordingly, the present invention includes the use of the protein of SEQ ID NO:316,
fragments comprising at least 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, or 200
consecutive amino acids thereof, or fragments having a desired biological activity to treat or
ameliorate a condition in an individual. For example, the condition may be an abnormality in
development, a signaling pathway, microtubule construction, neuronal survival, synapse formation,
conductance, neural differentiation, or it may be cancer or an abnormality in any of the functions
listed above. In such embodiments, the protein of SEQ ID NO:316, or a fragment thereof, is
administered to an individual in whom it is desired to increase or decrease any of the activities of
the protein of SEQ ID NO:316. The protein of SEQ ID NO:316 or fragment thereof may be
administered directly to the individual or, alternatively, a nucleic acid encoding the protein of SEQ
ID NO:316 or a fragment thereof may be administered to the individual. Alternatively, an agent
which increases the activity of the protein of SEQ ID NO:316 may be administered to the
individual. Such agents may be identified by contacting the protein of SEQ ID NO:316 or a cell or
preparation containing the protein of SEQ ID NO:316 with a test agent and assaying whether the
test agent increases the activity of the protein. For example, the test agent may be a chemical
compound or a polypeptide or peptide.

Alternatively, the activity of the protein of SEQ ID NO:316 may be decreased by
administering an agent which interferes with such activity to an individual. Agents which interfere
with the activity of the protein of SEQ ID NO:316 may be identified by contacting the protein of
SEQ ID NO:316 or a cell or preparation containing the protein of SEQ ID NO:316 with a test agent
and assaying whether the test agent decreases the activity of the protein. For example, the agent
may be a chemical compound, a polypeptide or peptide, an antibody, or a nucleic acid such as an
antisense nucleic acid or a triple helix-forming nucleic acid.

In one embodiment, the invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify the source of a sample as, for
example, brain, or to distinguish between two or more possible sources of a sample on the basis of
the level of the protein of SEQ ID NO:316 in the sample. For example, the protein of SEQ ID
NO:316 or fragments thereof may be used to generate antibodies using any techniques known to
those skilled in the art, including those described herein. Such antibodies may then be used to
identify tissues of unknown origin, for example, forensic samples, differentiated tumor tissue that
has metastasized to foreign bodily sites, or to differentiate different tissue types in a tissue
cross-section using immunochemistry. In such methods a sample is contacted with the antibody, which
may be detectably labeled, under conditions which facilitate antibody binding. The level of
antibody binding to the test sample is measured and compared to the level of binding to control cells
from brain or tissues other than brain to determine whether the test sample is from brain.
Alternatively, the level of the protein of SEQ ID NO:316 in a test sample may be measured by
determining the level of RNA encoding the protein of SEQ ID NO:316 in the test sample. RNA
levels may be measured using nucleic acid arrays or using techniques such as in situ hybridization,
Northern blots, dot blots or other techniques familiar to those skilled in the art. If desired, an
amplification reaction, such as a PCR reaction, may be performed on the nucleic acid sample prior
to analysis. The level of RNA in the test sample is compared to RNA levels in control cells from
brain or tissues other than brain to determine whether the test sample is from brain.

In another embodiment, antibodies to the protein of the invention or part thereof may be
used for detection, enrichment, or purification of cells expressing the protein of SEQ ID NO:316,
including using methods known to those skilled in the art. For example, an antibody against the
protein of SEQ ID NO:316 or a fragment thereof may be fixed to a solid support, such as a
chromatography matrix. A preparation containing cells expressing the protein of SEQ ID NO:316 is
placed in contact with the antibody under conditions which facilitate binding to the antibody. The
support is washed and then the cells are released from the support by contacting the support with
agents which cause the cells to dissociate from the antibody.

In another embodiment of the present invention, the protein of SEQ ID NO:316 or a
fragment thereof may be used to diagnose disorders associated with altered expression of the
protein of SEQ ID NO:316. In such techniques, the level of the protein of SEQ ID NO:316 in an ill
individual is measured using techniques such as those described herein. The level of the protein of
SEQ ID NO:316 in the ill individual is compared to the level in normal individuals to determine
whether the individual has a level of the protein of SEQ ID NO:316 which is associated with
disease.

Protein of SEQ ID NO:255 (106-037-1-0-E9-CS.cor)

The protein of SEQ ID NO:255, encoded by the cDNA of SEQ ID NO:14, is strongly
expressed in the liver and testis and shows extensive homology to human lactate dehydrogenase-A
protein (LDH-A or M chain) (Chung F.Z. et al., Biochem. J. 231:537-541 (1985); SwissProt
accession number P00338). The protein of SEQ ID NO:255 is also homologous to lactate
dehydrogenase A from many vertebrates. The 381-amino-acid-long protein of SEQ ID NO:255
displays a Prosite motif corresponding to lactate dehydrogenase from positions 71 to 380. In
addition, the active site LGEHGDS, where H is the active site residue, is present in the protein of
the invention (positions 239 to 245). The protein of the invention also contains an additional 50
N-terminal amino acids not found in other lactate dehydrogenase A proteins. This N-terminal
extension contains a signal peptide (cleavage site at position 34 of the protein of the invention) that may
allow the export of the protein to the extracellular domain or define a particular subcellular
localization. Alternatively, the initiation start codon could be at position 26 or 50 of the protein of
SEQ ID NO:255.

Lactate dehydrogenase (LDH) is an enzyme which dehydrogenates lactic acid into
pyruvic acid in conjunction with the hydrogen acceptor NAD+, and which exists in a wide
variety of animal tissues and microorganisms as an enzyme serving to produce lactic acid
from pyruvic acid in the glycolytic pathway (Abad-Zapatero C. et al. J. Mol. Biol.
198:445-467 (1987)). It is known that in vertebrates there are three isozymes of LDH: the M form
(LDH-A), found predominantly in muscle tissues; the H form (LDH-B), found in heart
muscle; and the X form (LDH-C), found only in the spermatozoa of mammals and birds.
In bird and crocodilian eye lenses, LDH-B serves as a structural protein and is known as
epsilon-crystallin (Hendriks W. et al. Proc. Natl. Acad. Sci. U.S.A. 85:7114-7118 (1988)).

LDH has been used extensively in the field of clinical test reagents for a number of
purposes. For example, it has been used as a coupling enzyme to determine the enzymatic
activity of various amino-transferases, such as alanine aminotransferase (ALT), which is
ultimately detected by UV spectrometry of the produced pyruvic acid. This use of LDH
has been widely adopted as a clinical test, because amino-transferases are enzymes which
show high activity in liver, heart, kidney, etc. and show remarkable increases in serum in
association with various diseases. LDH has also been used as a coupling enzyme to help
determine the level of substrates such as urea, as the enzyme promotes the conversion of
such substances into pyruvic acid which can be detected by UV spectrometry.

Lactate dehydrogenase is also a widely used marker for heart disease and other
conditions. For example, levels of LD-1 are elevated in the presence of myocardial
infarction and in other conditions such as leukemia. Levels of lactate dehydrogenase start
to increase 24 to 48 hours after occlusion of the coronary artery, peak in 3 to 6 days, and
return to normal in 8 to 14 days. In addition, levels of LD-1 are elevated 10 to 12 hours
after the acute myocardial infarction, peak in 2 to 3 days, and return to normal in
approximately 7 to 10 days. Thus, measurement of the level of lactate dehydrogenase
allows a prolonged retrospective diagnosis of myocardial infarction. Further, while the
amount of LD-2 in the blood is usually higher than the amount of LD-1, patients with acute
myocardial infarction have more LD-1 than LD-2. This "flipped ratio" usually returns to
normal in 7 to 10 days. An elevated level of LD-1 with a flipped ratio has a sensitivity and
specificity of approximately 75% to 90% for detection of acute myocardial infarction.
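
By way of illustration only, the two criteria described above (an elevated level of LD-1
together with a "flipped" LD-1/LD-2 ratio) may be expressed as the following minimal sketch.
The function names are hypothetical, and the upper reference limit for LD-1 is an assumption
that must be supplied by the caller, since no numerical cut-off is given herein.

```python
def has_flipped_ratio(ld1: float, ld2: float) -> bool:
    """True when LD-1 exceeds LD-2, i.e. the usual LD-2 > LD-1 order is reversed."""
    if ld1 < 0 or ld2 < 0:
        raise ValueError("isoenzyme levels must be non-negative")
    return ld1 > ld2


def suggests_acute_mi(ld1: float, ld2: float, ld1_upper_limit: float) -> bool:
    """Combine both criteria from the text: elevated LD-1 and a flipped ratio.

    ld1_upper_limit is a hypothetical, laboratory-specific upper reference
    limit for LD-1, expressed in the same units as ld1 and ld2.
    """
    elevated = ld1 > ld1_upper_limit
    return elevated and has_flipped_ratio(ld1, ld2)


if __name__ == "__main__":
    # Hypothetical values in arbitrary units, for illustration only.
    print(suggests_acute_mi(ld1=120.0, ld2=100.0, ld1_upper_limit=90.0))
    print(suggests_acute_mi(ld1=95.0, ld2=100.0, ld1_upper_limit=90.0))
```

Such a rule is only a screening aid; as noted above, its sensitivity and specificity are
approximately 75% to 90%.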

Elevated LDH levels have also been used as a prognostic indicator for cancers such
as small cell lung carcinoma. Specifically, elevated levels of LDH indicate a poor
prognosis for such diseases (Kawahara, et al., (1997) Jpn J Clin Oncol. 1997 Jun;27(3):158-65).

LDH expression in cells has also been shown to be induced by interleukin-1 alpha, a
major cytokine associated with, e.g., inflammation (Nehar et al. (1998) Biol Reprod
Dec;59(6):1425-32).

Islet beta-cells express low levels of lactate dehydrogenase and have high glycerol
phosphate dehydrogenase activity. The effects on glucose metabolism and insulin secretion
of acute overexpression of the skeletal muscle isoform of lactate dehydrogenase (LDH)-A
in these cells have been studied by Ainscow EK et al. (Diabetes 2000 Jul;49(7):1149). The
results of these studies have shown that overexpression of LDH activity interferes with
normal glucose metabolism and insulin secretion in islet beta cells, and it may therefore be
directly responsible for insulin secretory defects in some forms of type 2 diabetes. These
results also reinforce the view that glucose-derived pyruvate metabolism in the
mitochondria is critical for glucose-stimulated insulin secretion in beta cells. Other data
show that an overexpression of lactate dehydrogenase A attenuates glucose-induced insulin
secretion in stable MIN-6 beta-cell lines, which normally express low levels of L-lactate
dehydrogenase (Zhao C, Rutter GA FEBS Lett. 1998 Jul 3;430(3):213-6). Low LDH
activity thus appears to be important in beta-cell glucose sensing.

Analysis of the LDH isoenzyme pattern in CSF fluid has also been shown to be helpful in
the evaluation of CNS involvement in patients with hematologic malignancies (Lossos IS, et al.
Cancer. 2000 Apr 1;88(7):1599-604).

It is believed that the protein of SEQ ID NO:255 is a lactate dehydrogenase protein, most
likely of the LDH-A or M subtype. The activity of the present protein can be assessed using any
standard method for detecting lactate dehydrogenase enzyme activity, including those involving the
UV detection of pyruvate, a product of LDH-catalyzed enzymatic reactions.
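
By way of illustration only, the following minimal sketch shows how a measured absorbance
change might be converted into enzyme activity units using the Beer-Lambert law. It rests on
assumptions that are not part of the present description: the reaction is followed through the
NADH cofactor at 340 nm rather than through pyruvate directly, the commonly cited millimolar
absorptivity of NADH (6.22 per mM per cm) is used, and the function and parameter names are
hypothetical.

```python
# Assumption: millimolar absorptivity of NADH at 340 nm, in 1/(mM*cm).
NADH_MILLIMOLAR_ABSORPTIVITY_340 = 6.22


def ldh_activity_units_per_ml(
    delta_a340_per_min: float,
    total_volume_ml: float,
    sample_volume_ml: float,
    path_length_cm: float = 1.0,
) -> float:
    """Activity in units (micromoles of NADH converted per minute) per mL of sample.

    delta_a340_per_min is the magnitude of the absorbance change per minute
    taken from the linear part of the progress curve.
    """
    if total_volume_ml <= 0 or sample_volume_ml <= 0 or path_length_cm <= 0:
        raise ValueError("volumes and path length must be positive")

    # Beer-Lambert: concentration change in mM/min, i.e. micromol per mL per min.
    rate_mm_per_min = abs(delta_a340_per_min) / (
        NADH_MILLIMOLAR_ABSORPTIVITY_340 * path_length_cm
    )
    # Scale to the whole cuvette, then normalise to the sample volume added.
    units_in_cuvette = rate_mm_per_min * total_volume_ml
    return units_in_cuvette / sample_volume_ml


if __name__ == "__main__":
    # Hypothetical reading: 0.0622 absorbance units per minute, 1 mL assay, 10 microliters of sample.
    print(ldh_activity_units_per_ml(0.0622, total_volume_ml=1.0, sample_volume_ml=0.01))
```

The same calculation applies whichever direction of the reaction is followed, since only the
magnitude of the absorbance change is used.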

In one embodiment, the polypeptides and polynucleotides of the invention are used to detect
testis and liver tissue, as well as cells derived from these tissues. For example, nucleic acids and
proteins of the invention can be labeled isotopically or chemically, using methods known to those
skilled in the art, and used as probes in northern blots, far-western blots and in situ hybridization
experiments. An ability to detect specific cell types is useful, e.g., for the determination of the
history of tumor cells, as well as for the identification of cells and tissues for histological studies.

In another embodiment, the present protein can be used in any of a variety of clinical assays
involving LDH enzymes. For example, the protein can be used as a coupling enzyme to determine
the enzymatic activity of various amino-transferases, such as alanine aminotransferase (ALT), as
detected by UV spectrometry of the produced pyruvic acid. Such assays have significant clinical
utility, as amino-transferases are enzymes which show high activity in liver, heart, kidney, etc. and
show remarkable increases in serum in association with various diseases. The protein of the
invention can also be used as a coupling enzyme to help determine the level of substrates such as
urea, as the enzyme promotes the conversion of such substances into pyruvic acid which can be
detected by UV spectrometry.

In another embodiment, the present protein can be used to identify ingredients for cosmetic
formulations. Specifically, inhibitors of lactate dehydrogenase can be included in cosmetic
compositions to stimulate keratinocyte proliferation and collagen synthesis in cutaneous tissues.
The inhibitors can be combined with other active ingredients such as pyruvic acid, acetic acid,
acetoacetic acid, beta-hydroxybutyric acid, Krebs cycle pathway metabolites, aliphatic saturated or
unsaturated fatty acids containing from 8 to 26 carbon atoms, omega-hydroxy acids containing from
22 to 34 carbon atoms, glutamic acid, glutamine, valine, alanine, leucine, and mixtures thereof (see,
e.g., US Patent 5,853,742, the disclosure of which is hereby incorporated by reference in its
entirety).

In another embodiment, the present invention provides methods for treating or preventing
cancer, e.g., by inhibiting lactate dehydrogenase activity in cells, preferably specifically the cancer
cells, of a patient. The expression or activity of lactate dehydrogenase can be inhibited using any of
a large number of agents, including, but not limited to, antibodies, antisense molecules, ribozymes,
and heterologous molecules that inhibit the expression or activity of the lactate dehydrogenase in
the cancer cells of the patient. In one embodiment, lactate dehydrogenase that has been obtained
from a primate, or anti-lactate dehydrogenase antibodies obtained from a mammal as a result of the
parenteral administration of primate lactate dehydrogenase to said mammal, is parenterally
administered to human cancer patients. Antibodies derived from the protein of the invention or part
thereof can also be used to inhibit cancer cell development as described in US Patent No. 4,620,972.

Analysis of the LDH isoenzyme pattern in CSF fluid has been shown to be helpful in the
evaluation of CNS involvement in patients with hematologic malignancies (Lossos IS, et al. Cancer.
2000 Apr 1;88(7):1599-604). Thus, in another embodiment, the protein of SEQ ID NO:255 can be
used to develop assays to monitor the LDH isoenzyme activity in CSF fluid, thereby improving the
sensitivity of CSF cytology. This assay may be derived, e.g., from the methods described by Short
S. et al. (J Biol Chem. 2000 Apr 28;275(17):12963-9).

In another embodiment, the protein of SEQ ID NO:255 is used to detect and/or treat insulin
secretory defects in some forms of type 2 diabetes. For example, various evidence indicates that
LDH overexpression may be involved in certain types of diabetes. Therefore, the detection of an
elevated level of LDH in a patient, e.g., in pancreatic islet cells of a patient, can be used as an
indication that the patient has diabetes, or is at risk of developing diabetes. Similarly, methods of
inhibiting the expression or activity of LDH in those cells, e.g., using antibodies, antisense
sequences, or heterologous compounds that inhibit the expression or activity of LDH, can be used to
treat or prevent diabetes.

In another embodiment, the protein of the invention can be used to eliminate endogenous 
pyruvic acid in cells in vitro or in vivo. 

In another embodiment, the expression of the present protein is used as a marker for
interleukin 1, e.g., IL-1 alpha, activity in cells or in a patient. Specifically, as it has been shown that
LDH expression is induced by IL-1 alpha, the expression, or elevated expression, of the
present protein can be used as a marker for the action of IL-1 on the cell. As IL-1 has been
implicated in a number of physiological processes, including inflammation and more specifically in
deleterious processes such as arthritis and autoimmune disorders, the present protein can serve as a
marker for the presence of such disorders, or for a predisposition for the disorders.

In another embodiment, the present protein is used to detect heart disease and other
diseases in patients. For example, levels of LDH are known to rise following myocardial
infarction and other heart ailments. Accordingly, the detection of an elevated level of the
protein of the invention, alone or in view of the levels of other proteins such as other LDH
isozymes, can be used as an indicator of a heart attack or other diseases, including
leukemia. The levels of LDH can be assessed in any tissue or biological sample, including,
but not limited to, serum, and can be detected using any standard method, including, but
not limited to, immunoassays and assays for LDH enzyme activity.

In another embodiment, the present protein is used to determine a prognosis for any
of a number of diseases, including cancers such as small cell lung carcinoma. For example,
the level of the present protein is detected in the serum of a patient suffering from cancer,
wherein the detection of an elevated level of expression or activity of the protein indicates a
worse prognosis for the patient compared to the prognosis in a patient with a normal level
of the protein activity or expression.

Proteins of SEQ ID NOs:243, 253 (internal designation numbers 105-016-1-0-D3-CS and
105-095-2-0-GU-CS)

The 331-amino-acid-long protein of SEQ ID NO:243, encoded by the cDNA of SEQ ID
NO:2, is found in prostate and in fetal brain and is homologous to a secreted human protein (Genseq
accession number Y59685). In addition, this protein is highly homologous to the putative
glycerophosphodiester phosphodiesterase (GP-PDE) MIR16 (Membrane Interacting protein of
RGS16) protein (SPTREMBLNEW SPTREMBL SWISSPROT accession number AAF65234)
encoded by the cDNA of GENPEPT GENPEPTNEW accession number AF212862; in fact, the
protein of the invention is a likely variant of the MIR16 protein. Furthermore, a BLAST search
with the amino acid sequence of SEQ ID NO:243 indicates that the protein of the invention is
homologous to GP-PDEs of E. coli (SWISSPROT accession numbers P09394 and P10908) and
Haemophilus influenzae (SWISSPROT accession number Q06282). The protein of SEQ ID
NO:243 displays 2 candidate membrane-spanning segments, from amino acids 7 to 27 and 258 to
278, and a putative signal peptide from amino acids 19 to 24. Finally, the protein of the invention
has two putative N-glycosylation sites: asparagine residues at positions 168 and 198 (Zheng et al.,
Proc. Natl. Acad. Sci. 97:3999-4004 (2000)).
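
By way of illustration only, putative N-glycosylation sites of the kind mentioned above are
commonly located by scanning a protein sequence for the consensus sequon Asn-X-Ser/Thr, where X
is any residue except proline. The following minimal sketch assumes that convention; the
present description does not state how the sites at positions 168 and 198 were predicted, and
the function name and the example sequence are hypothetical.

```python
import re

# Consensus sequon N-X-S/T with X != P. The lookahead makes the match
# zero-width so that overlapping sequons (e.g. "NNST") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glycosylation_sites(protein_seq: str) -> list[int]:
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein_seq.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a sequence of the invention.
    # N3 (NGS) and N11 (NLT) qualify; N7 is followed by proline and is skipped.
    print(n_glycosylation_sites("MANGSANPTQNLT"))
```

A match to the sequon only marks a candidate site; whether the asparagine is actually
glycosylated depends on the cellular context and must be confirmed experimentally.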

The cDNA of SEQ ID NO:2 differs from the cDNA of GENPEPT GENPEPTNEW
accession number AF212862 by its extended 5' and 3' termini, and from the cDNA of SEQ ID
NO:12 by polymorphisms and alternate splicings.

The MIR16 (Membrane Interacting protein of RGS16) protein, which is homologous to the
protein of the invention, was identified in a yeast two-hybrid screen of a pituitary cell cDNA library
using the RGS16 (Regulator of G protein Signaling) protein as bait (Zheng et al., Proc. Natl. Acad.
Sci. 97:3999-4004 (2000)). and Sasaki, J. Bacteriol. 175:4569-4571 (1993); Zheng et al., ibid.).
Remarkably, the GP-PDE from Haemophilus influenzae (also called protein D), which is 67%
identical to the periplasmic GP-PDE of E. coli, has affinity for human immunoglobulin D (Janson
et al., Infect. Immun. 62:4848-4854 (1994)).

From sequence alignments, it can be seen that the N-terminal region of MIR16 (amino
acids 70-150), immediately after the putative signal peptide, is highly conserved (40-61%
similarity), suggesting that it may contain residues critical for catalytic activity, i.e., the catalytic
site. GP-PDEs hydrolyze deacetylated phospholipid GPs, such as glycerophosphocholine (GPC)
and glycerophosphoethanolamine, to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols
(Zheng et al., ibid.). The putative enzymatic activity of MIR16 and its interaction with RGS16
suggest that it may play important roles in lipid metabolism and in G protein signaling. As shown
in northern blot experiments, the MIR16 mRNA is highly transcribed in heart, liver, kidney, testis
and brain. The observed expression of MIR16 in the brain is consistent with the above-described
expression of the protein of the invention in the fetal brain.

It is believed that the proteins of SEQ ID NOs:243 and 253 or part thereof are members of
the glycerophosphodiester phosphodiesterase protein family, interact with the RGS16 protein and,
as such, play important roles in both lipid metabolism and in G protein signaling. Preferred
polypeptides of the invention are polypeptides comprising the amino acids of SEQ ID NO:243 from
positions 7 to 27, 19 to 24 and 258 to 278. Other preferred polypeptides of the invention are
fragments of SEQ ID NO:243 or 253 having any of the biological activities described herein.
Additional preferred polypeptides are those that comprise asparagine residues at positions 168
and/or 198.

The invention first relates to methods and compositions using cDNAs of SEQ ID NO:2 or
12 or part thereof, and proteins of the invention SEQ ID NO:243 or 253 or part thereof to identify
specific cell types, preferably from prostate or fetal brain. For example, nucleic acids and proteins
of the invention are labeled isotopically or chemically following methods known to those skilled in
the art, and further used as probes in northern blots, far-western blots and in situ hybridization
detection experiments. An ability to detect specific cell types is useful, e.g., for the determination of
the history of tumor cells, as well as for the identification of cells and tissues for histological
studies.

Any of a number of in vitro assays can be used to detect SEQ ID NO:243 or 253 protein
activity, for example for in vitro screening of modulators of protein activity. Preferably, cDNA
encoding the protein of the invention is cloned in a prokaryotic expression vector, according to
methods known to those skilled in the art. Briefly, the GP-PDE activity of the recombinant protein
is analyzed by a coupled spectrophotometric assay as described by Larson and collaborators and
adapted by Cameron and collaborators (Larson et al., J. Biol. Chem. 258:5426-5432 (1983);
Cameron et al., Infect. Immun. 66:5763-5770 (1998)). Such enzymatic activity may be measured
in vitro in the presence of modulating drugs.
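
By way of illustration only, the conversion of a spectrophotometric reading into an enzymatic rate, and the comparison of rates with and without a candidate modulator, may be sketched as follows. The sketch assumes, which the present text does not state, that the coupled reaction yields one NADH per G3P released and is read at 340 nm; the function names, cuvette volume and absorbance slopes are hypothetical and do not reproduce the cited protocols.

```python
# Minimal sketch: convert an absorbance slope into enzyme activity (Beer-Lambert law).
# ASSUMPTION: the coupled reaction reports one NADH per G3P released, read at 340 nm.

NADH_EXT_COEFF = 6220.0  # M^-1 cm^-1, molar absorptivity of NADH at 340 nm


def activity_umol_per_min(delta_a_per_min, volume_ml, path_cm=1.0):
    """Return micromoles of product formed per minute in the cuvette."""
    molar_per_min = delta_a_per_min / (NADH_EXT_COEFF * path_cm)  # mol/L/min
    return molar_per_min * (volume_ml / 1000.0) * 1e6  # mol/min -> umol/min


def percent_inhibition(rate_control, rate_with_compound):
    """Relative loss of activity in the presence of a candidate modulator."""
    return 100.0 * (1.0 - rate_with_compound / rate_control)


if __name__ == "__main__":
    # Hypothetical slopes (absorbance units per minute) in a 1 ml, 1 cm cuvette.
    control = activity_umol_per_min(0.0622, volume_ml=1.0)
    treated = activity_umol_per_min(0.0311, volume_ml=1.0)
    print(control, treated, percent_inhibition(control, treated))
```

A negative value returned by percent_inhibition would correspond to a compound that increases, rather than decreases, the measured activity.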

Another embodiment of the present invention relates to methods of using the protein of the
invention or part thereof to purify or specifically bind to human immunoglobulin D. Several
immunoglobulin (Ig) binding bacterial cell wall proteins have been isolated and/or cloned during the
last two decades. The best characterized of these are protein A of Staphylococcus aureus (which
binds to human IgG subclasses 1, 2 and 4, IgG of several mammalian species, and in some
instances human Ig of classes A, M, E), and protein G of group G beta-hemolytic streptococci
(which binds to all human IgG subclasses and which also displays a wider binding spectrum for
animal IgG than protein A). IgD binds to neither protein A nor protein G. Consequently, it is of
great interest to identify new proteins capable of binding IgD, thereby allowing its separation and
purification. In addition, IgD binding proteins can also be used in immunoprecipitation procedures
with IgD, as are routinely performed with proteins A and G in the case of IgG. The binding and
purification of IgD using the protein of the invention can be accomplished in any of a number of
ways, for example by generating a fusion protein or polypeptide in which the protein of the
invention, or part thereof, is combined with another protein by the use of a recombinant DNA
molecule. The resulting fusion product including the protein of the invention or part thereof is then
bound, covalently or by any other means, to a protein, carbohydrate or matrix (such as gold,
"Sephadex" particles, polymeric surfaces). Such a complex is very useful for IgD immobilization
and consecutive immunoprecipitations in batch. Similar assays for binding of protein D (GP-PDE)
of Haemophilus influenzae and IgD are described in US Patent No. 6,025,484.

Another embodiment of the invention relates to compositions and methods using the protein
of the invention, or part thereof, as GP-PDE enzymes to hydrolyze deacylated phospholipids (GPs),
such as glycerophosphocholine (GPC) and glycerophosphoethanolamine, to sn-glycerol-3-phosphate
(G3P) and the corresponding alcohols. First, this enzymatic activity, which belongs to
the class of specific phospholipase D, makes the protein of the invention very useful to study
biological membranes and their phospholipid components. Moreover, as glycerophospholipids
are major components of the lipid bilayer, elimination of their hydrophilic moiety using the
GP-PDE activity of the protein of the invention would likely modify the structure and consequently the
permeability of eukaryotic cell membranes. Such modifications could improve the transfection
efficiency of eukaryotic cells, in vitro or in vivo. Typically, in such embodiments the purified
protein of SEQ ID NO:243 or 253 is administered to cells; purified proteins of the invention can
be obtained in any of a number of ways, for example by inserting the cDNA encoding the proteins into
a prokaryotic expression vector using any technique known to those skilled in the art. The
recombinant protein produced and purified in the prokaryotic system is then added to an in vitro
culture of eukaryotic cells before or during transfection. The recombinant protein of the invention
can also be used to increase the efficiency of cell transfection in vivo, most notably in the case of
gene therapy. For example, tumor masses are very often resistant to transfection, and the protein
of the invention would likely provide an effective way to facilitate the introduction of cytotoxic
genes (such as pro-apoptotic genes) or antitumoral drugs into solid tumors.

Still another embodiment of the invention relates to methods and
compositions to diagnose, treat, and prevent disorders associated with excess glutamate signaling in
the brain. As described above, the MIR16 protein interacts physically with the RGS16 protein
(Regulator of G protein Signaling 16). Receptors of many hormones use heterotrimeric G proteins
for signal transduction after ligand binding (for a review, see Neer, Cell 80:249-257 (1995)).
Among these receptors are metabotropic glutamate receptors (mGluRs). These receptors, which are
expressed in the brain, like the protein of the invention, are a novel family of cloned
G-protein-coupled receptors (Schoepp and Conn, Trends Pharmacol. Sci. 14:13-20 (1993)). Endogenous
glutamate, by activating the mGluR1 receptor (and also NMDA and AMPA receptors), may
contribute to the brain damage occurring acutely after epilepsy, cerebral ischemia or traumatic brain
injury. It may also contribute to chronic neurodegeneration in such disorders as amyotrophic lateral
sclerosis and Huntington's chorea (Meldrum, J. Nutr. 130(4S Suppl):1007S-1015S (2000)).

The invention thus relates to methods and compositions using cDNAs of SEQ ID NO:2 or
12 or part thereof, and proteins of SEQ ID NO:243 or 253 or part thereof, to diagnose, treat, or
prevent disorders associated with excess glutamate signaling in the brain. Specifically, the level of
activity or expression of the proteins can be correlated with the level of glutamate signaling, or with
the glutamate-signaling associated brain damage involved in epilepsy, cerebral ischemia, traumatic
brain damage, ALS, or Huntington's chorea, or with any other G-protein associated physiological
process or disease or condition. For situations where the level of the expression or activity of the
protein is positively correlated with such signaling or with the presence of a disease or condition,
the signaling, disease or condition can be detected using any of a number of tools for detecting
protein expression or activity, including northern blots, far-western blots and in situ hybridization
experiments, where an elevated level of the protein, protein activity, or nucleic acid of the invention
indicates the presence of the disease, condition, or signaling process. Further, such diseases or
conditions can be treated or prevented, or such signaling pathways can be inhibited, using
compounds that inhibit the expression or activity of the protein, such as antibodies, antisense
molecules, ribozymes, dominant negative forms of the protein, or any heterologous molecule that
inhibits protein activity or expression. Alternatively, where the expression or activity of the protein
of the invention is negatively associated with the signaling pathway, disease or condition, a
detection of a decreased level of expression or activity of the protein can be used to indicate the
presence of the disease, condition, or pathway. Further, in such cases, the disease or condition can
be treated or prevented, or the pathway be inhibited, using any compound that increases the activity
or level of the protein, such as nucleic acids encoding the protein, the protein itself, or heterologous
compounds that cause an increase in the level of protein expression or activity.

Protein of SEQ ID NO:386 (internal designation 105-037-4-O-H12-CS)

The protein of SEQ ID NO:386, encoded by the cDNA of SEQ ID NO:145, is strongly
expressed in the fetal brain and uterus. The 207-amino-acid-long protein of SEQ ID NO:386
displays pfam SPRY domains from positions 85 to 205.

SPRY domains have been found in a number of proteins involved in multiple cellular and
developmental processes. For example, the Midline-1/FXY family of proteins has been shown to
associate with microtubules, and has been implicated in human diseases, such as Opitz Syndrome, a
congenital disorder characterized by multiple developmental abnormalities (see, e.g., Cainarca, et
al., (1999) Hum Mol Genet 8(8):1387-96). In addition, the cytoplasmic Marenostrin/Pyrin protein
has been demonstrated to be the cause of Familial Mediterranean fever, an autosomal recessive
disorder characterized by fever and serositis (Nat Genet 1997 Sep;17(1):25-31). Other SPRY
proteins include SplA, a serine protease from Staphylococcus aureus, and butyrophilin, a major
milk protein. Another family of proteins known to contain the SPRY domain is the ryanodine
receptors (RyRs).

Ryanodine receptors play an important role in Ca2+ signaling in muscle and non-muscle
cells by releasing Ca2+ from intracellular stores. For example, these receptors are centrally
important in excitation-contraction (e-c) coupling, which occurs at specialized regions where the
sarcoplasmic reticulum (SR), containing the ryanodine receptors, and the plasma
membrane/transverse-tubule system form junctions. RyRs are also thought to play some role in
maintaining the structural integrity of the SR/T-tubule junctions. RyR is apparently unable to carry
out the requisite functions associated with e-c coupling by itself, however, because it forms
interactions with other macromolecules at the triad junction. For example, two small proteins,
calmodulin and FKBP12, are believed to modulate RyR at the triad junction.

It is believed that mammalian tissues express three different RyR isoforms, comprising four
560-kDa (RyR polypeptide) and four 12-kDa (FK506 binding protein) subunits. It is believed that
these large protein complexes conduct monovalent and divalent cations and are capable of multiple
interactions with other molecules. The subunits of the protein complexes include small diffusible
endogenous effector molecules including Ca2+, Mg2+, adenine nucleotides, sulfhydryl modifying
reagents (glutathione, NO, and NO adducts) and lipid intermediates, and proteins such as protein
kinases and phosphatases, calmodulin, immunophilins (FK506 binding proteins), and in skeletal
muscle the dihydropyridine receptor. The RyR from skeletal muscle is the major calcium release
channel for that tissue, and the most intensively studied of the three genetic isoforms detected thus
far in mammalian species. The other two RyR isoforms are often referred to as the 'heart' and 'brain'
forms, but the actual cell and tissue distribution of the isoforms is complex.

Because of their multiple ligand interactions, ryanodine receptors constitute an important,
potentially rich pharmacological target for controlling cellular functions. Ca2+ release channel
activity is modulated by many endogenous effectors, including Ca2+, ATP, Mg2+, and calmodulin.
In addition, many exogenous effectors, including caffeine, local anesthetics, and polyamines, also
modify channel activity. For example, tetracaine, procaine, benzocaine, and lidocaine inhibit Ca2+
release from the SR. They appear to interact with a specific site(s) located on the RyR, affecting
both ryanodine-binding and single channel activities (Shoshan-Barmatz et al. 1993; J. Membr. Biol.;
133; 171-181).

The importance of intracellular calcium as a second messenger in cellular signal
transduction processes is well established. Alterations in intracellular Ca2+ homeostasis have
profound effects on many cell functions, including secretion, contraction-relaxation, motility,
metabolism, protein synthesis, modification and folding, gene expression, cell-cycle progression
and apoptosis. A major source of cytoplasmic calcium is from intracellular storehouses located in
the endoplasmic reticulum, or in muscle, within the sarcoplasmic reticulum (SR).

Given that cellular Ca2+ handling is an important factor in the control of neuronal
metabolism and electrical activity, abnormalities of intracellular Ca2+ channels might be expected
to contribute to some forms of epilepsy or to anoxic brain damage following an episode of cerebral
ischemia. Cell loss is said to be a characteristic feature of degenerative brain disorders, including
Alzheimer's disease. It is well established that neuronal cell death may be secondary to an
abnormal elevation of cytoplasmic Ca2+, particularly that associated with activation of excitatory
glutamate receptors (e.g., in epilepsy). This strongly suggests that the release of stored Ca2+
contributes to nerve cell damage and cell death in various circumstances.

It is believed that the protein of SEQ ID NO:386 is functionally related to other
SPRY-containing proteins, such as the ryanodine receptors, Marenostrin/Pyrin, SplA, Midline-1/FXY, and
butyrophilin. Accordingly, it is believed that the present protein is associated with the release
of Ca2+ from intracellular Ca2+-storing organelles, like the endoplasmic reticulum and, in muscle,
the sarcoplasmic reticulum (SR), as well as being involved in microtubule binding. Preferred
polypeptides of the invention are any fragments of SEQ ID NO:386 having any of the biological
activities described herein.

In one embodiment, the present protein and nucleic acids can be used to specifically detect
cells of the fetal brain and uterus, as the protein is overexpressed in these tissues. For example, the
protein of the invention or part thereof may be used to synthesize specific antibodies using any
technique known to those skilled in the art. Such tissue-specific antibodies may then be used to
identify tissues of unknown origin, such as in forensic samples, differentiated tumor tissue that has
metastasized to foreign bodily sites, etc., or to differentiate different tissue types in a tissue
cross-section using immunochemistry. The protein can also be used to specifically label microtubules in
cells.

In another embodiment, the protein of the invention or part thereof may be used in
regulating intracellular Ca2+ levels. As alterations in intracellular Ca2+ homeostasis have profound
effects on many cell functions, including secretion, contraction-relaxation, motility, metabolism,
protein synthesis, modification and folding, gene expression, cell-cycle progression and apoptosis,
the ability to modulate intracellular Ca2+ levels provides a tool to alter any of these cellular
functions, in vitro or in vivo. Such an ability has wide utility for a large number of applications, for
example to manipulate the behavior (e.g. growth rate, secretion, survival, etc.) of cells grown in
vitro, as well as to treat, prevent, or diagnose any of a number of diseases associated with altered
Ca2+ signaling in vivo. The activity or expression of the protein of the invention can be modulated
in any of a large number of ways, for example by administering to cells or to a patient the protein
itself, a polynucleotide encoding the protein, antibodies, antisense sequences, dominant negative
forms of the protein, compounds that alter the expression or activity of the protein, etc. The effect
of any such agent on calcium flux in cells can be detected using standard methods, including by
studying the permeation of Ca2+ release through endoplasmic reticulum (ER) and sarcoplasmic
reticulum (SR) channels using tracers, light scattering and fluorescence quenching, and channel
reconstitution in planar bilayer. In addition, targeted recombinant photoproteins can provide direct
measurements of organellar Ca2+ (Montero et al.; 1995; EMBO J.; 14, 5467-5475).

The invention further relates to methods and compositions using the protein of the invention
or part thereof to diagnose, prevent and/or treat several disorders in which the activity or
recognition of ryanodine receptors is impaired or excessive. These disorders include, but are not
limited to, neurodegenerative diseases, cardiovascular disorders, severe myasthenia, malignant
hyperthermia, epilepsy, and central core disease. For example, in patients with severe myasthenia,
the level of anti-RyR antibodies has been directly related to the severity of the disease (Skeie et al.,
1996; Eur. J. Neurol. 3; 136-140). There is also some evidence to suggest that RyR abnormalities
are a primary cause of many types of cardiac disease. In addition, the protein of the invention can
be used to diagnose other diseases associated with SPRY-protein dysfunction, such as Familial
Mediterranean fever and Opitz syndrome. Finally, as SPRY-containing proteins have been
implicated in embryonic development (e.g. the Midline-1 protein), the protein and nucleic acids of
the invention can be used to detect developmental disorders, as the detection of a mutation in the
gene encoding SEQ ID NO:386, or a detection of abnormal gene expression in a fetus, can be used
to indicate the presence of a developmental abnormality. For example, as the protein of SEQ ID
NO:386 is strongly expressed in the fetal brain, it is likely that the protein plays a role in the normal
development of the brain in utero.

The present invention also relates to diagnostic assays for detecting altered levels of the
protein of SEQ ID NO:386 in various tissues, as over-expression of the protein compared to normal
control tissue samples can indicate the presence of certain disease conditions such as
neurodegenerative disorders, cardiovascular disorders, severe myasthenia, malignant hyperthermia,
epilepsy, and central core disease. Assays used to detect levels of the polypeptide of the present
invention in a sample derived from a host are well-known to those of skill in the art and include
radioimmunoassays, competitive-binding assays, Western Blot analysis and ELISA assays.

Proteins of SEQ ID NOs:283 and 286 (internal designations 174-38-1-0B6-CS LA and 174-41-1-0-A6-CS LA)

The protein of SEQ ID NO:283, encoded by the cDNA of SEQ ID NO:42, is overexpressed
in salivary glands and to a lesser extent in bone marrow, and shows homology over the C-terminal
length to the immunoglobulin (Ig) protein superfamily, which is conserved among eukaryotes
(including rabbit, rodents and human). In particular, the 468-amino-acid-long protein of the
invention, which is similar in size to the constant chain of Ig related proteins, displays two pfam
conserved immunoglobulin domains, from position 205 to 285 and from position 318 to 384, which
are known to be involved in the basic structure of the light and heavy constant chains of
immunoglobulins. It is known (Orr H.T., Nature 282:266-270 (1979)) that the Ig constant chain
domains and a single extracellular domain in each type of MHC chain are closely related, sharing
over one hundred amino acids of homology. All members of the Ig related superfamily, including
the MHC class I alpha chain and beta-2-microglobulin, as well as the MHC class II alpha and beta
chains, display the prosite conserved characteristic pattern around the C-terminal cysteine
([FY]-x-C-x-[VA]-x-H). This cysteine is involved in the disulfide bond between the light and heavy chains,
and is also found in the protein of the invention (position 380 to 386). The protein of the invention
also exhibits an emotif Ig and Major Histocompatibility Complex protein signature from positions
319 to 336. In addition, the protein of the invention displays homology with tapasin (GenBank
No. AP009510), a chaperone-like protein closely associated with TAP-binding proteins, which is
well conserved among eukaryotes (chicken, rodents and human). Tapasin has been shown to
increase the efficiency of antigen processing and presentation by mediating the association of MHC
complex proteins with TAP proteins to the endoplasmic reticulum and to the cell surface during
immune response (for review see Abele, R. and Tampe, R., Biochim. Biophys. Acta, 1999). In
addition, the protein of the invention displays two transmembrane domains from positions 199 to
219 and from positions 406 to 426, a hydrophobic profile similar in amino acid position to the
hydrophobic stretch of amino acids of human and mouse tapasin (Suling L., J. Biol. Chem.,
274:8649-8654, 1999), and a secreted signal peptide from position 9 to 23. Both signatures are
largely present in Ig related proteins such as secreted antibodies or antigen presenting proteins. The
invention also encompasses a variant (SEQ ID NO:286) of SEQ ID NO:283, encoded by the cDNA
of SEQ ID NO:45. The protein of SEQ ID NO:286 is a 442-amino-acid-long protein whose
C-terminal end is shorter by 26 amino acids compared to the protein of SEQ ID NO:283. The variant of
SEQ ID NO:286, which results from a frameshift (position 1445 in SEQ ID NO:45) in the coding
sequence that leads to a stop codon in the corresponding protein, displays characteristics identical to
those described above in terms of motifs, Ig signatures, function, and potential uses.
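
As a non-limiting illustration, the prosite pattern [FY]-x-C-x-[VA]-x-H referred to above, which spans seven residues (consistent with positions 380 to 386), may be searched for in a protein sequence as sketched below. The helper names and the short test sequence are hypothetical and are not taken from the sequence listing; the converter handles only the single-residue, bracketed-set and "x" elements used in this pattern, not the full prosite syntax.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a regular expression.

    Only literal residues, bracketed residue sets and the wildcard 'x'
    are supported; repeat counts such as x(2) are not handled.
    """
    parts = []
    for element in pattern.split("-"):
        parts.append("." if element == "x" else element)
    return "".join(parts)


def find_motif(sequence, pattern):
    """Return (start, end, match) tuples using 1-based inclusive positions."""
    # A lookahead lets overlapping occurrences be reported as well.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.start() + len(m.group(1)), m.group(1))
            for m in regex.finditer(sequence)]


IG_MHC_PATTERN = "[FY]-x-C-x-[VA]-x-H"

if __name__ == "__main__":
    toy_sequence = "MKTAYRCQVTHEGSFACLAPH"  # hypothetical sequence, illustration only
    for start, end, match in find_motif(toy_sequence, IG_MHC_PATTERN):
        print(start, end, match)
```

Positions are reported with the first residue numbered 1, matching the numbering convention used throughout the present description.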

The immunoglobulin (Ig) gene superfamily comprises a large number of cell surface
glycoproteins that share sequence homology with the V and C domains of antibody heavy and light
chains. These molecules function as receptors for antigens, immunoglobulins and cytokines as well
as adhesion molecules, and play important roles in regulating the complex cell interactions that
occur within the immune system (A. F. Williams et al., Annu. Rev. Immunol. 6:381-405, 1988; T.
Hunkapiller et al., Adv. Immunol. 44:1-63, 1989; for a short review see also Prosite entry PS00290).

The introduction of an antigen into a host initiates a series of events culminating in an
immune response. In addition, self-antigens can result in immunological tolerance or activation of
an immune response against self-antigens. A major portion of the immune response is regulated by
presentation of antigen by major histocompatibility complex molecules. MHC molecules bind to
peptide fragments derived from antigens to form complexes that are recognized by T cell receptors
on the surface of T cells, giving rise to the phenomenon of MHC-restricted T cell recognition. The
ability of a host to react to a given antigen (responsiveness) is influenced by the spectrum of MHC
molecules expressed by the host. Responsiveness correlates with the ability of specific peptide
fragments to bind to particular MHC molecules.

There are two types of MHC molecules, class I and class II, each of which comprises two
chains. In class I [2], the alpha chain is composed of three extracellular domains, a transmembrane
region, and a cytoplasmic tail. The beta chain (beta-2-microglobulin) is composed of a single
extracellular domain. In class II [3], both the alpha and the beta chains are composed of two
extracellular domains, a transmembrane region and a cytoplasmic tail. MHC class I molecules are
expressed on the surface of all cells, and MHC class II molecules are expressed on the surface of
antigen presenting cells. MHC class II molecules bind to peptides derived from proteins made
outside of an antigen presenting cell. In contrast, MHC class I molecules bind to peptides derived
from proteins made inside a cell. In order to present peptide in the context of a class II molecule, an
antigen presenting cell phagocytoses an antigen into an intracellular vesicle, in which the antigen is
cleaved, bound to an MHC class II molecule, and then returned to the surface of the antigen
presenting cell.

Major histocompatibility complex (MHC) class I molecules present antigenic peptides to
CD8 T cells (Townsend, A. et al., Nature 340:443-448). The peptides are generated in the cytosol
and then translocated across the membrane of the endoplasmic reticulum by the transporter
associated with antigen processing (TAP). TAP is a trimeric complex consisting of TAP1, TAP2,
and tapasin (TAP-A). TAP1 and TAP2 are required for the peptide transport. Tapasin mediates the
interaction of MHC class I HC-beta-2-microglobulin with TAP, and this interaction is essential for
peptide loading onto MHC class I HC-beta-2-microglobulin (Suling et al., J. Biol. Chem.,
274:8649-8654). T cell receptors (TCRs) are the second antigen recognition molecules, and
recognize antigens that are bound by MHC molecules. Recognition of MHC complexed with
peptide (MHC-peptide complex) by TCR can affect the activity of the T cell bearing the TCR.
Thus, MHC-peptide complexes are important in the regulation of T cell activity and, consequently, in
regulating an immune response.

Human cytomegalovirus (HCMV) is a betaherpesvirus which causes clinically serious
disease in immunocompromised and immunosuppressed adults, as well as in some infants infected
in utero or perinatally (Alford, C. A., and W. J. Britt. 1990. Cytomegalovirus, p. 1981-2010. In D.
M. Knipe and B. N. Fields (ed.), Virology, 2nd ed. Raven Press, New York). In human
cytomegalovirus (HCMV)-infected cells, expression of the cellular major histocompatibility
complex (MHC) class I heavy chains is down-regulated, where down-regulation is defined as
reduction in either synthesis, stability or surface expression of MHC class I heavy chains. A similar
phenomenon has been reported for some other DNA viruses, including adenovirus, murine
cytomegalovirus, and herpes simplex virus (Anderson, M., et al., Cell 43:215-222, 1985; Burgert
and Kvist, Cell 41:987-997, 1985; Heise T. M., et al., J. Exp. Med. 187:1037-1046, 1998). In the
adenovirus and herpes simplex virus systems, the product of a viral gene which is dispensable for
replication in vitro is sufficient to cause down-regulation of MHC class I heavy chains (Anderson,
M., et al., 1985, supra). The gene(s) involved in class I heavy chain down-regulation by murine
cytomegalovirus have not yet been identified.

It is believed that the proteins of SEQ ID NOs:283 and 286 are members of the
immunoglobulin superfamily and, as such, play a role in the immune response, cellular proteolysis,
cell proliferation and differentiation, pathogen recognition, apoptosis, and other processes
associated with the Ig superfamily. In addition, the proteins of the invention are thought to be
tightly linked to the antigen processing and presentation system in the context of peptide assembly
and translocation of foreign peptides across endoplasmic reticulum and cell surface membranes as
new chaperonin-like proteins associated with MHC I and TAP proteins. The weak homology (30%)
with the TAP protein family is thought to indicate the specificity of the interactions of the proteins
of the invention with MHC proteins and/or TAP-related proteins, as described by Suling et al.,
supra.

Preferred polypeptides of the invention are polypeptides comprising the amino acids of
SEQ ID NO:283 from position 9 to 23, 199 to 219, 205 to 285, 318 to 384, 319 to 336, 380 to 386
and from 406 to 426. Other preferred polypeptides of the invention are fragments of SEQ ID
NO:283 having any of the biological activities described herein.
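
For illustration only, the preferred fragments listed above may be extracted from a full-length amino acid sequence as sketched below, the positions being treated as 1-based and inclusive. The function name is hypothetical, and the placeholder string merely stands in for the 468-residue sequence of SEQ ID NO:283, which is not reproduced here.

```python
# Position ranges of the preferred polypeptides of SEQ ID NO:283, as listed above.
PREFERRED_RANGES_283 = [
    (9, 23), (199, 219), (205, 285), (318, 384),
    (319, 336), (380, 386), (406, 426),
]


def extract_fragments(sequence, ranges):
    """Map each (start, end) range, 1-based inclusive, to its subsequence."""
    fragments = {}
    for start, end in ranges:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError("range %d-%d lies outside the sequence" % (start, end))
        # Python slices are 0-based and end-exclusive, hence start - 1.
        fragments[(start, end)] = sequence[start - 1:end]
    return fragments


if __name__ == "__main__":
    # PLACEHOLDER of the stated length (468 residues); NOT the real sequence.
    placeholder = ("ACDEFGHIKLMNPQRSTVWY" * 24)[:468]
    for (start, end), fragment in extract_fragments(placeholder, PREFERRED_RANGES_283).items():
        print(start, end, len(fragment))
```

Several of the listed ranges overlap (for example 318 to 384, 319 to 336 and 380 to 386), so the fragments are returned independently rather than merged.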

In one embodiment, the invention relates to methods and compositions for using the protein
of the invention or part thereof as a marker protein to selectively identify tissues, such as salivary
glands and bone marrow tissues, which strongly express the protein of the invention. For example,
the protein of the invention or part thereof may be used to synthesize specific antibodies using any
techniques known to those skilled in the art including those described herein. Such tissue-specific
antibodies may then be used to identify tissues of unknown origin, for example, forensic samples,
differentiated tumor tissue that has metastasized to foreign bodily sites, or to differentiate different
tissue types in a tissue cross-section using immunochemistry.

In another embodiment, the invention relates to methods for using the protein of the
invention to visualize proteins and peptides involved in the antigen recognition system within cells by
virtue of their physical interaction with the proteins of the invention. For example, the protein may
be used to detect the presence and/or the localization of MHC peptides and TAP-like proteins in a
cell. The protein of the invention, and hence any interacting proteins, can be labeled using any of a
number of methods, including by binding with specific antibodies or by creating a fusion protein
comprising the protein of the invention as well as a readily detectable moiety, such as an epitope
tag, biotin, or green fluorescent protein.

In another embodiment, polynucleotide or polypeptide sequences of the invention or part
thereof may be used for the diagnosis of a disorder associated with a loss of regulation of the
expression of the protein of the invention, preferably, but not limited to, deficiencies of the MHC
protein system. Examples of such disorders include, but are not limited to, acquired
5 immunodeficiency syndrome (AIDS), X-linked agammaglobinemia of Bruton, common variable 
immunodeficiency (CVI), DiGeorge's syndrome (thymic hypoplasia), thymic dysplasia, isolated 
IgA deficiency, severe combined immunodeficiency disease (SCID), immunodeficiency with 
thrombocytopenia and eczema (Wiskott-Aldrich syndrome), Chediak-Higashi syndrome, chronic 
granulomatous diseases, hereditary angioneurotic edema, immunodeficiency associated with 

10 Cushing's disease, Addison's disease, adult respiratory distress syndrome, allergies, ankylosing 
spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, 
autoimmune thyroiditis, bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic 
dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with 
lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, 

15 glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, 

hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or 
pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's 
syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, systemic 
lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, 

20 Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, leukemias 
such as multiple myeloma, and lymphomas such as Hodgkin's disease; a cell proliferative disorder 
such as arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease 
(MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, 
primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, 

25 melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, 
bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, 
kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, 
spleen, testis, thymus, thyroid, and uterus; and an infection, such as infections by viral agents 
classified as adenovirus, arenavirus, bunyavirus, calicivirus, coronavirus, filovirus, hepadnavirus,
herpesvirus, flavivirus, orthomyxovirus, parvovirus, papovavirus, paramyxovirus, picornavirus,
poxvirus, reovirus, retrovirus, rhabdovirus, and togavirus; infections by bacterial agents classified as
pneumococcus, staphylococcus, streptococcus, bacillus, corynebacterium, Clostridium, 
meningococcus, gonococcus, listeria, moraxella, kingella, haemophilus, legionella, bordetella, 
gram-negative enterobacterium including shigella, salmonella, and Campylobacter, pseudomonas,
vibrio, brucella, francisella, yersinia, bartonella, nocardia, actinomyces, mycobacterium,
spirochaetale, rickettsia, chlamydia, and mycoplasma; infections by fungal agents classified as
aspergillus, blastomyces, dermatophytes, cryptococcus, coccidioides, malassezia, histoplasma, and
other fungal agents causing various mycoses; and infections by parasites classified as Plasmodium
or malaria-causing, parasitic entamoeba, leishmania, trypanosoma, toxoplasma, Pneumocystis
carinii, intestinal protozoa such as giardia, trichomonas, tissue nematodes such as trichinella,
intestinal nematodes such as ascaris, lymphatic filarial nematodes, trematodes such as schistosoma,
and cestodes such as tapeworm. To assess abnormal expression of the present protein associated
with any of these disorders, the level of the present polynucleotides or polypeptides can be detected 
in a biological sample or cell using any standard method, including Southern or northern analysis, 
dot blots, other membrane-based technologies, PCR technologies, dipstick, pin, ELISA assays, and 
in microarrays. Any of these methods may be used for the diagnosis of disorders characterized by
an alteration of expression of SEQ ID NO:283 or 286, such as the disorders mentioned above, or in
assays to monitor patients being treated with SEQ ID NO:283 or 286 or agonists, antagonists, or 
inhibitors of SEQ ID NO:283 or 286. Antibodies useful for diagnostic purposes may be prepared, 
e.g., in the same manner as that described in U.S. Patent No. 6,135,941. Diagnostic assays for SEQ 
ID NO:283 or 286 include methods which utilize the antibody and a label to detect SEQ ID NO:283
or 286 in human body fluids or in extracts of cells or tissues. The antibodies may be used with
or without modification, and may be labeled by covalent or non-covalent attachment of a reporter 
molecule. A wide variety of reporter molecules, several of which are described above, are known in 
the art and may be used. 

In another embodiment, the protein of SEQ ID NO:283 or 286 or a fragment or derivative
thereof may be administered to a subject to diagnose, treat or prevent an immune disorder
associated with decreased expression or activity of the protein of the invention. Such disorders can
include, but are not limited to, acquired immunodeficiency syndrome (AIDS), X-linked
agammaglobulinemia of Bruton, common variable immunodeficiency (CVI), DiGeorge's syndrome
(thymic hypoplasia), thymic dysplasia, isolated IgA deficiency, severe combined immunodeficiency
disease (SCID), immunodeficiency with thrombocytopenia and eczema (Wiskott-Aldrich
syndrome), Chediak-Higashi syndrome, chronic granulomatous diseases, hereditary angioneurotic
edema, immunodeficiency associated with Cushing's disease, Addison's disease, adult respiratory
distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis,
autoimmune hemolytic anemia, autoimmune thyroiditis, bronchitis, cholecystitis, contact dermatitis,
Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic
lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis,
glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis,
hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or
pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's
syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, systemic
lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis,
Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, leukemias
such as multiple myeloma, and lymphomas such as Hodgkin's disease. In addition, such disorders
associated with decreased protein expression or activity can be treated by administering to a patient
polynucleotide sequences encoding the protein of the invention, e.g. inserted in an appropriate
vector. In another example, a compound that increases either the activity of the protein of the
invention or its expression can be administered to a patient to treat or prevent any of the diseases
mentioned above.

In a further embodiment, an antagonist of the protein of the invention may be administered
to a subject to treat or prevent an immune disorder associated with increased expression or activity
of the protein of SEQ ID NO:283 or 286 including, but not limited to, auto-immune diseases or
graft rejection. In one aspect, an antibody which specifically binds the protein of the invention may
be used directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a
pharmaceutical agent to cells or tissues which express the proteins of the invention, such as the
salivary gland tissue or the bone marrow tissue. In addition, sense, antisense nucleotides, GSE,
ribozymes, specific protein inhibitors such as antibodies or small compounds can be administered
to inhibit the expression of the proteins of the invention.

In another embodiment, an antagonist of the protein of SEQ ID NO:283 may be
administered to a subject to treat or prevent a cell proliferative disorder. Such disorders may
include, but are not limited to, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed
connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria,
polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma,
leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of
the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia,
gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis,
prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus. In one aspect, an antibody
which specifically binds the protein of the invention may be used directly as an antagonist or
indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissue
which express the protein of the invention. In another example, sense, antisense nucleotides, GSE,
or ribozymes designed from nucleotides of the invention can be administered to inhibit the
expression of the protein of the invention.

Protein of SEQ ID NO: 411 (internal designation 181-10-1-0-C9-CS)

The protein of SEQ ID NO: 411, encoded by the cDNA of SEQ ID NO: 170, is highly
expressed in fetal liver. The protein of the invention is homologous to peripheral benzodiazepine
receptor/isoquinoline binding protein (PBR/IBP) of human, bovine and murine origin (Genbank
accession numbers M36035, M64520 and L17306 respectively). The 170-amino-acid protein of
SEQ ID NO: 411 is similar in size and hydropathicity to known peripheral
benzodiazepine receptors/isoquinoline binding proteins (PBR/IBP). Like the known peripheral benzodiazepine
receptors/isoquinoline binding proteins, the protein of the subject invention has about five potential
transmembrane domains at positions 3-23, 45-65, 82-102, 105-125 and 130-150. Moreover, the
protein of the invention displays a stretch of 11 amino acids (starting with V144 and ending with
R154) that corresponds to a recently identified putative cholesterol recognition/interaction amino
acid consensus pattern (-L/V-(X)(1-5)-Y-(X)(1-5)-R/K-) [See Li et al., Endocrinology 1998 Dec;
139(12): 4991-7].
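
The consensus pattern quoted above can be scanned for with a regular expression. The following is a minimal sketch, assuming the pattern is read as L or V, then one to five arbitrary residues, then Y, then one to five arbitrary residues, then R or K. The function name `find_crac_motifs` and the toy sequences are hypothetical and are not taken from SEQ ID NO: 411.

```python
import re

# Consensus as quoted in the text: -L/V-(X)(1-5)-Y-(X)(1-5)-R/K-
# Assumption: X stands for any single residue.
CRAC_PATTERN = re.compile(r"[LV].{1,5}Y.{1,5}[RK]")


def find_crac_motifs(protein_seq):
    """Return (start, end, motif) tuples using 1-based, inclusive positions.

    Matches are non-overlapping and greedy, so each hit is the longest
    stretch satisfying the pattern from its start position.
    """
    hits = []
    for match in CRAC_PATTERN.finditer(protein_seq.upper()):
        hits.append((match.start() + 1, match.end(), match.group(0)))
    return hits


# Hypothetical toy sequences for illustration only.
print(find_crac_motifs("MAAVLNYELLKRAG"))
print(find_crac_motifs("VAAAAYAAAAR"))
```

A hit spanning eleven residues that starts with V and ends with R, as described for positions 144 to 154, would be reported as a single tuple whose start and end differ by ten.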

The peripheral benzodiazepine receptor (PBR) is an 18-kDa protein containing binding sites
for benzodiazepine and is distinct from the GABA neurotransmitter receptor [Papadopoulos, V.
(1993) Endocr. Rev. 14: 222-240]. Expression of PBR has been found in every tissue examined.
However, it is most abundant in steroidogenic cells and is found primarily on outer
mitochondrial membranes [Anholt, R. et al. (1986) J. Biol. Chem. 261:576-583]. PBR is thought to
be associated with a multimeric complex composed of the 18-kDa isoquinoline binding protein and
the 34-kDa pore-forming voltage dependent anion channel protein, preferentially located on the
outer/inner mitochondrial membrane contact sites [McEnery, M.W. et al. Proc. Natl. Acad. Sci.
USA 89:3170-3174; Garnier, M. et al. (1994) Mol. Pharmacol. 45:201-211; Papadopoulos, V. et
al. (1994) Mol. Cell. Endocr. 104:R5-R9]. Drug ligands of PBR, upon binding to the receptor,
stimulate steroid synthesis in steroidogenic cells in vitro [Papadopoulos, V. et al. (1990) J. Biol.
Chem. 265: 3772-3779; Barnea, E. R. et al. (1989) Mol. Cell. Endocr. 64: 155-159; Amsterdam, A.
and Suh, B.S. (1991) Endocrinology 128: 503-510]. Likewise, in vivo studies showed that high
affinity PBR ligands increase steroid plasma levels in hypophysectomized rats [Amri, H. et al.
(1996) Endocrinology 137:5707-5718]. Further in vitro studies on isolated mitochondria provided
evidence that PBR ligands, drug ligands, or the endogenous PBR ligand (the polypeptide
diazepam-binding inhibitor (DBI) [Papadopoulos, V. et al. (1997) Steroids 62: 21-28]) stimulate pregnenolone
formation by increasing the rate of cholesterol transfer from the outer to the inner mitochondrial
membrane [for review, see Culty, M. et al. (1999) Journal of Steroid Biochemistry and Molecular
Biology 69: 123-130].

Based on the amino acid sequence of the 18-kDa PBR, a three dimensional model was
developed [Papadopoulos, V. (1996) In: The Leydig Cell. Payne, A. H. et al. (eds) Cache River
Press, IL, pp 596-628]. This model was shown to accommodate a cholesterol molecule and
function as a channel, supporting the role of PBR in cholesterol transport. The role of PBR in
steroidogenesis was also demonstrated by observing that PBR negative cells generated by
homologous recombination failed to produce steroids [Papadopoulos, V. et al. (1997) J. Biol. Chem.
272: 32129-32135]. Further, cholesterol transport experiments in bacteria expressing the 18-kDa
PBR protein provided definitive evidence for a function as a cholesterol channel/transporter
[Papadopoulos, V. et al. (1997) supra].

In addition to its role in mediating cholesterol movement across membranes, PBR has been
implicated in several other physiological functions, including cell growth and differentiation,
chemotaxis, mitochondrial physiology, porphyrin and heme biosynthesis, immune response, anion
transport and GABAergic regulation of the CNS [for review, see Gavish, M. et al. (1999)
Pharmacological Reviews 51: 629-650; Beurdeley-Thomas, A. et al. (2000) Journal of
Neuro-Oncology 46: 45-56]. A recent report also indicates that PBR agonists are potent anti-apoptotic
compounds. These findings suggest that this effect may represent a major function for this receptor
[Bono, F. et al. (1999) Biochemical and Biophysical Research Communications 265:457-461].

It appears that PBR is associated with stress and anxiety disorders. It has been suggested 
that PBRs play a role in the regulation of several stress systems such as the HPA axis, the 
sympathetic nervous system, the renin-angiotensin axis, and the neuroendocrine axis. In these
systems, acute stress typically leads to increases in PBR density, whereas chronic stress typically
leads to decreases in PBR density. Furthermore, in Generalized Anxiety Disorder (GAD), Panic 
Disorder (PD), Generalized Social Phobia (GSP), and Post-Traumatic Stress Disorders (PTSD), 
PBR density is typically decreased in platelets. 

In the brain, where PBRs are associated with glial cells, PBRs are increased in specific
brain areas in neurodegenerative disorders and also after neurotoxic and traumatic-ischemic brain
damage [for review, see Gavish, M. et al. (1999) supra]. The literature also reports a decrease in
peripheral-type benzodiazepine receptors in postmortem brains of chronic schizophrenics, suggesting that
the decreased density of PBRs in the brain may be involved in the pathophysiology of
schizophrenia. Increased levels of PBR in autopsied brain tissue from patients with portal-systemic
encephalopathy (PSE) have been reported, thus supporting the theory that activation of
PBR contributes to the pathogenesis characteristic of PSE in the
central nervous system [Kurumaji, A. et al. (1997) J. Neural Transm. 104:1361-1370; Butterworth,
R. F. (2000) Neurochemistry International 36: 411-416].

In addition to its involvement in the neurological disorders discussed supra, PBR has been
implicated in the regulation of tumor cell proliferation [for review, see Gavish, M. et al. (1999)
supra; Beurdeley-Thomas, A. et al. (2000) supra; Hardwick, M. (1999) Cancer Research 59:831-
842; Venturini, I. et al. (1998) Life Sci. 63:1269-80; Carmel, I. et al. (1999) Biochem. Pharmacol. 58:
273-8]. The invasiveness and metastatic ability of human breast tumor cells is proportional to the
level of PBR expressed. Further, PBR has been proposed as a tool/marker for the detection,
diagnosis, prognosis and treatment of cancer [WO 99/49316, hereby incorporated by reference in its
entirety].

Many ligands have been described that bind to peripheral benzodiazepine receptor with
various affinities. Some benzodiazepines, Ro 5-4864 [4-chlorodiazepam], diazepam and structurally
related compounds, are potent and selective PBR ligands. Exogenous ligands also include
2-phenylquinoline carboxamides (PK11195 series), imidazo[1,2-a]pyridine-3-acetamides (Alpidem
series) and pyridazine derivatives. Some endogenous compounds, including porphyrins and
diazepam binding inhibitor (DBI), bind to PBR with nanomolar and micromolar affinity [for
review, see Gavish, M. et al. (1999) supra; Beurdeley-Thomas, A. et al. (2000) supra].

The protein of SEQ ID NO: 411 is a novel peripheral-type benzodiazepine receptor. As
such, it serves a channel function that mediates cholesterol movement across membranes, and plays a
role in steroidogenesis, cell growth and differentiation, chemotaxis, mitochondrial physiology,
protection against apoptosis, porphyrin and heme biosynthesis, immune response, anion transport
and GABAergic regulation of the CNS.

In one embodiment, a preferred polypeptide of the invention comprises the amino acids of
SEQ ID NO: 411 from position 144 to 154. In another embodiment, the subject invention provides
a polypeptide comprising the sequence of SEQ ID NO: 411. Other preferred polypeptides of the
invention include biologically active fragments of SEQ ID NO: 411. Biologically active fragments
of the protein of SEQ ID NO: 411 have any of the biological activities described herein which are
associated with the PBR. In another embodiment, the polypeptide of the invention is encoded by
clone 181-10-1-0-C9-CS.

One aspect of the subject invention provides compositions and methods using the protein of
the invention, or biologically active fragments thereof, for the development, identification, and/or
selection of agents capable of modulating the expression or activity of the protein of the invention.

Agents which modulate the activity of the PBR/IBP of the subject invention include, but are
not limited to, antisense oligonucleotides, ribozymes, drugs, and antibodies. These agents may be
made and used according to methods well known in the art. Also, the protein of the invention, or
biologically active fragments thereof, may be used in screening assays for therapeutic compounds.
A variety of drug screening techniques may be employed. In this aspect of the invention, the
protein or biologically active fragment thereof may be free in solution, affixed to a solid support,
recombinantly expressed on, or chemically attached to, a cell surface, or located intracellularly.
The formation of binding complexes between the protein of the invention, or biologically active
fragments thereof, and the compound being tested may then be measured.

In one embodiment, the subject method utilizes eukaryotic or prokaryotic host cells which
are stably transformed with recombinant nucleic acids expressing the PBR/IBP polypeptide or
biologically active fragments thereof. The transformed cells may be viable or fixed. Drugs or
compounds which are candidates for the modulation of the PBR/IBP, or biologically active
fragments thereof, are screened against such transformed cells in binding assays well known to
those skilled in the art. Alternatively, assays such as those taught in Geysen H. N., WO Application
84/03564, published on Sep. 13, 1984, and incorporated herein by reference in its entirety, may be
used to screen for peptide compounds which demonstrate binding affinity for, or the ability to
modulate, the PBR/IBP, or biologically active fragments thereof. In another embodiment,
competitive drug screening assays are used in which neutralizing antibodies specifically compete with a test
compound for binding to the PBR/IBP protein of the invention, or biologically active fragments
thereof.

Another embodiment of the subject invention provides compositions and methods of
selectively modulating the expression or activity of the protein of the invention. Modulation of the
PBR/IBP would allow for the successful treatment and/or management of diseases or biochemical
abnormalities associated with the PBR or PBR/IBP. Antagonists, able to reduce or inhibit the
expression or the activity of the protein of the invention, would be useful in the treatment of
diseases associated with elevated levels of the PBR/IBP, increased cell proliferation, or increased
cholesterol transport. Thus, the subject invention provides methods for treating a variety of diseases
or disorders, including, but not limited to, cancers, especially liver cancer, and portal-systemic
encephalopathy.

Alternatively, the subject invention provides methods of treating diseases or disorders
associated with decreased levels of the PBR/IBP. Thus, the subject invention provides
methods of treating diseases including, but not limited to, schizophrenia, chronic stress, GAD, PD,
GSP and PTSD. Other diseases which may be treated by agonists of the PBR/IBP of the subject
invention include those diseases associated with decreases in cell proliferation, e.g. developmental
retardation.

Furthermore, because the PBR/IBP of the subject invention is also able to transport
cholesterol into cells, the subject invention may also be used to increase cholesterol transport into
cells. Diseases associated with cholesterol transport deficiencies include lipoidal adrenal
hyperplasia, and diseases in which there is a requirement for increased production of
cholesterol-dependent compounds such as myelin, for example Alzheimer's disease, spinal cord
injury, and brain development neuropathy [Snipes, G. and Suter, U. (1997) Cholesterol and Myelin.
In: Subcellular Biochemistry, Robert Bittman (ed.), vol. 28, pp. 173-204, Plenum Press, New York].
The methods of treating disorders associated with decreased levels of PBR/IBP may be practiced by
introducing agonists which stimulate the expression or the activity of the protein of the invention.

In one embodiment, methods of increasing the levels of PBR/IBP in tissues or cell types
may be practiced by utilizing nucleic acids encoding the protein of the subject invention, or
biologically active fragments thereof, to introduce biologically active polypeptide into targeted cell
types. Vectors useful in such methods are known to those skilled in the art, as are methods of
introducing such nucleic acids into target tissues.

Agents which stimulate or inhibit the activity of the protein of the invention include but are 
not limited to agonist and antagonist drugs respectively. These drugs can be obtained using any of a 
variety of drug screening techniques as discussed above. 

Antagonists of the PBR/IBP encoded by SEQ ID NO: 170 include agents which decrease
the levels of expressed mRNA encoding the protein of SEQ ID NO: 411. These include, but are not
limited to, RNAi, one or more ribozymes capable of digesting the mRNA encoding the protein of the invention,
or antisense oligonucleotides capable of hybridizing to mRNA encoding the PBR/IBP of SEQ ID
NO: 411. Antisense oligonucleotides can be administered as DNA, as DNA entrapped in
proteoliposomes containing viral envelope receptor proteins [Kaneda, Y. et al. (1989) Science 243:
375] or as part of a vector which can be expressed in the target cell and provide antisense DNA or
RNA. Vectors which are expressed in particular cell types are known in the art. Alternatively, the
DNA can be injected along with a carrier. A carrier can be a protein such as a cytokine, for example
interleukin 2, or polylysine-glycoprotein carriers. Carrier proteins, vectors, and methods of making
and using polylysine carrier systems are known in the art. Alternatively, nucleic acid encoding
antisense molecules may be coated onto gold beads and introduced into the skin with, for example,
a gene gun [Ulmer, J.B. et al. (1993) Science 259:1745].
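
As an illustration of how an antisense oligonucleotide relates to its target, the following minimal sketch derives the antisense DNA sequence as the reverse complement of a chosen window of a sense-strand sequence. The function name `antisense_oligo`, the 1-based `start` convention, and the toy sequence are hypothetical; the sketch does not address target-site selection, oligonucleotide chemistry, or delivery.

```python
# Complement table for a sense-strand input given as DNA or RNA.
# The antisense oligonucleotide is returned as DNA, written 5' to 3'.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def antisense_oligo(sense_seq, start, length):
    """Return the antisense DNA oligo for a window of the sense strand.

    start is 1-based; the window covers start .. start + length - 1.
    """
    seq = sense_seq.upper()
    if start < 1 or length < 1 or start - 1 + length > len(seq):
        raise ValueError("window falls outside the sequence")
    window = seq[start - 1:start - 1 + length]
    try:
        complement = [_COMPLEMENT[base] for base in window]
    except KeyError as exc:
        raise ValueError(f"unexpected base: {exc.args[0]}") from None
    # Reverse so that the result reads 5' to 3'.
    return "".join(reversed(complement))


# Hypothetical toy sense sequence, not taken from SEQ ID NO: 170.
print(antisense_oligo("GGAATGGCCTTT", start=4, length=6))
```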

Antibodies, or other polypeptides, capable of reducing or inhibiting the activity of PBR/IBP
may be provided in isolated and substantially purified form. Alternatively, antibodies or other
polypeptides capable of inhibiting or reducing the activity of the PBR/IBP protein may be
recombinantly expressed in the target cell to provide a modulating effect. In addition, compounds
which inhibit or reduce the activity of the PBR/IBP protein of the subject invention may be
incorporated into biodegradable polymers that are implanted in the vicinity of where drug delivery is
desired. For example, biodegradable polymers may be implanted at the site of a tumor or,
alternatively, biodegradable polymers containing antagonists/agonists may be implanted to slowly
release the compounds systemically. Biodegradable polymers, and their use, are known to those of
skill in the art (see, for example, Brem et al. (1991) J. Neurosurg. 74:441-446).

In another embodiment, the invention provides methods and compositions for detecting the
level of expression of the mRNA of the protein of the invention. Quantification of mRNA levels of
the PBR/IBP protein of the invention may be useful for the diagnosis or prognosis of diseases
associated with an altered expression of the protein of the invention. Assays for the detection and
quantification of the mRNA of the protein of the invention are well known in the art (see, for
example, Maniatis, Fritsch and Sambrook, Molecular Cloning: A Laboratory Manual (1982), or
Current Protocols in Molecular Biology, Ausubel, F.M. et al. (Eds), Wiley & Sons, Inc.).

Polynucleotide probes or primers for the detection of the mRNA of the protein of SEQ ID
NO: 411 can be designed from the cDNA of SEQ ID NO: 170. Methods for designing probes and
primers are known in the art. In another embodiment, the subject invention provides diagnostic kits
for the detection of the mRNA of the protein of the invention in cells. The kit comprises a package
having one or more containers of oligonucleotide primers for detection of the mRNA of the protein of the
invention in PCR assays or one or more containers of polynucleotide probes for the detection of the
mRNA of the protein of the invention by in situ hybridization or Northern analysis. Kits may,
optionally, include containers of various reagents used in various hybridization assays. The kit may
also, optionally, contain one or more of the following items: polymerization enzymes, buffers,
instructions, controls, or detection labels. Kits may also, optionally, include containers of reagents
mixed together in suitable proportions for performing the hybridization assay methods in
accordance with the invention. Reagent containers preferably contain reagents in unit quantities that
obviate measuring steps when performing the subject methods.
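
As a rough illustration of one step in designing primers from a cDNA, the following sketch computes the GC fraction and an approximate melting temperature of a short candidate primer using the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius. This rule of thumb is an assumption introduced here for illustration and is only a coarse estimate for short oligonucleotides; the function names and the toy primer are hypothetical and are not taken from SEQ ID NO: 170.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a DNA primer (0.0 to 1.0)."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Approximate melting temperature in degrees Celsius (Wallace rule)."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical toy primer for illustration only.
candidate = "ATGGCCAAGT"
print(gc_fraction(candidate), wallace_tm(candidate))
```

In practice, candidate primers would also be checked for specificity, self-complementarity, and matched melting temperatures within a pair, none of which this sketch attempts.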

In another embodiment, the invention relates to methods and compositions for detecting and
quantifying the level of the protein of the invention present in a particular biological sample. These
methods are useful for the diagnosis or prognosis of diseases associated with altered levels of the
protein of the invention. Diagnostic assays to detect the protein of the invention may comprise a
biopsy, in situ assay of cells from organ or tissue sections, or an aspirate of cells from a tumor or
normal tissue. In addition, assays may be conducted upon cellular extracts from organs, tissues,
cells, urine, or serum or blood or any other body fluid or extract.

Assays for the quantification of the PBR/IBP of SEQ ID NO: 411 may be performed
according to methods well known in the art. Typically, these assays comprise contacting the sample
with a ligand of the protein of the invention or an antibody (polyclonal or monoclonal) which
recognizes the protein of the invention or a fragment thereof, and detecting the complex formed
between the protein of the invention present in the sample and the ligand or antibody. Fragments of
the ligands and antibodies may also be used in the binding assays, provided these fragments are
capable of specifically interacting with the PBR/IBP of the subject invention. Further, the ligands
and antibodies which bind to the PBR/IBP of the invention may be labeled according to methods
known in the art. Labels which are useful in the subject invention include, but are not limited to,
enzyme labels, radioisotopic labels, paramagnetic labels, and chemiluminescent labels. Typical
techniques are described by Kennedy, J. H., et al. (1976) Clin. Chim. Acta 70:1-31; and Schuurs, A.
H. et al. (1977) Clin. Chim. Acta 81: 1-40.

The subject invention also provides methods and compositions for the identification of
metastatic tumor masses. In this aspect of the invention, the polypeptides and antibodies which
bind the polypeptides of the invention may be used as a marker for the identification of the
metastatic tumor mass. Metastatic tumors which originated from the liver may overexpress the
PBR/IBP of SEQ ID NO: 411, whereas newly forming tumors, or those originating from other
tissues, are not expected to bear the PBR/IBP of SEQ ID NO: 411.

Protein of SEQ ID NO: 397 (internal designation 160-28-4-0-C4-CS)

The protein of SEQ ID NO: 397, encoded by the cDNA of SEQ ID NO: 156 (clone
160-28-4-0-C4-CS), exhibits homology to the ADP-ribosylation factors (ARF) family of proteins. The
ARF family includes ADP-ribosylation factors (ARFs) and ARF-like proteins (ARLs); the ARF
family of proteins is one family of the Ras superfamily. Proteins belonging to the Ras superfamily
have molecular weights of 18-30 kDa and function in a variety of cellular processes including, but
not limited to, signaling, growth, immunity, and protein transport.

ARFs are monomeric GTP-binding proteins, related structurally to both G protein
alpha-subunits and Ras proteins. ARF family members share more than 60% sequence identity, appear to
be ubiquitous in eukaryotes, and are evolutionarily highly conserved throughout. Immunologically,
they have been localized to the Golgi apparatus of several types of cells (Stearns et al. Proc. Natl.
Acad. Sci. (USA) 87:1238-1242 (1990)). ARF proteins enhance the ADP-ribosyltransferase
activity of cholera toxin as an allosteric activator (Noda et al. Biochim. Biophys. Acta 1034: 195-
199 (1990)). ARFs have also been shown to act as regulatory molecules, or "switches", for linking
two processes, e.g., the process of vesicle fission from a donor compartment and fusion with an
acceptor compartment (Rothman, J. E. and Wieland, F. T. Science 272: 227-234 (1996)). ARF
family members fall into three classes, classes I-III, according to their size and sequence homology.
Class I comprises ARF1, ARF2, and ARF3; Class II comprises ARF4 and ARF5; and Class III
comprises ARF6.

The classes occupy different subcellular locations and have been implicated in different
transport pathways. Class I ARFs localize to the Golgi where they are involved in the regulation of
ER-Golgi and intra-Golgi transport. Class I ARFs are also involved in the recruitment of cytosolic
coat proteins to Golgi membranes during the formation of transport vesicles. Class III (e.g., ARF6)
localizes to a tubulovesicular compartment, secretory granules, and the plasma membrane, where it
is involved in regulated secretion and recycling. Class II ARFs appear to be cytosolic, but their role
has not been elucidated (Radhakrishna, H. and Donaldson, J. G. J. Cell Biol. 139: 49-61 (1997)).

ARF function, in general, is regulated by a GDP-GTP cycle. For example, ARF1 is
cytosolic in the GDP bound state, but is associated with membranes when in the GTP bound state.
A guanine nucleotide exchange factor (GEF) in the donor compartment recruits ARF1 to the
membrane. At the membrane, GTP-ARF1 recruits coat proteins, which assemble together into
spherical coats, budding off vesicles in the process. After budding, hydrolysis of bound GTP causes
ARF1 to dissociate from the membrane. ARF1 dissociation causes the coat to become unstable and
dissociate as well (Rothman, supra).

Members of the ARF multigene family, when expressed as recombinant proteins in E. coli,
display different phospholipid and detergent requirements (Price, et al. J. Biol. Chem. 267: 17766-
17772 (1992)). Some lipids and/or detergents, e.g., SDS, cardiolipin,
dimyristoylphosphatidylcholine (DMPC)/cholate, enhance ARF activities (Bobak, et al.
Biochemistry 29:855-861 (1990); Noda, et al. Biochim. Biophys. Acta 1034: 195-199 (1990); Tsai,
et al. J. Biol. Chem. 263:1768-1772 (1988)). ARFs also activate phospholipase D (PLD), a
membrane-bound enzyme implicated as an effector of several growth factors (Boman, A. L. and
Kahn, R. A. Trends Biochem. Sci. 20: 147-150 (1995)). PLD1 has been shown to be activated by a
variety of G-protein regulators, for example, PKC (protein kinase C) and ADP-ribosylation factor
(ARF). PKC and ARFs may regulate G-proteins either individually or together in a synergistic
manner. Recently the role of ARFs in microtubule formation has also been demonstrated.
ADP-ribosylation of tubulin almost completely blocked self-assembly of this protein in brain (Terashima,
M. et al. J. Nutr. Sci. Vitaminol. 45: 393-400 (1999)).

In general, differences in the various ARF sequences are concentrated in the amino-terminal 
regions and the carboxyl portions of the proteins. Only three of 17 amino acids in the amino termini
have been shown to be identical among ARFs, and four amino acids in this region of ARFs 1-5 are
missing in ARF6 (Tsuchiya, et al. J. Biol. Chem. 266: 2772-2777 (1991)). It was reported (Kahn, et
al. J. Biol. Chem. 267:13039-13046 (1992)) that the amino-terminal regions of ARF proteins form 
an alpha-helix and that this domain is required for membrane targeting, interaction with lipid, and 
ARF activity. 

Schliefer et al. (J. Biol. Chem. 257: 20-23 (1991)) have described a protein distinctly larger
than ARF that possessed ARF-like activity. ARF-like proteins, or ARLs, have been found in
different species. Some ARLs appear to lack ADP-ribosyltransferase-enhancing activity; ARLs
may differ in GTP-binding requirements and GTPase activity as compared to various ARF
isoforms. For example, ARP, a mammalian ARL, is 33-39% identical to members of the ARF
family; ARP, however, differs from other ARF family proteins by virtue of its ability to hydrolyze
bound GTP in the absence of other proteins. ARP protein, unlike ARFs, is typically associated with
the plasma membrane instead of the cytosol (Schurmann, A. J. Biol. Chem. 270, 30657-30663 (1995)).

ARF family members have been implicated in several disease processes, such as Lowe's
syndrome, an X-linked disorder characterized by congenital cataracts, renal tubular dysfunction and
neurological deficits. These disorders may be due to an inability to recruit ARF to the Golgi
membrane (Suchy, S. F. et al. Hum. Mol. Genet. 4: 2245-2250 (1995); Londono I. et al. Kidney Int.
55: 1407-1416 (1999)). It has also been suggested that regulation of ARF is involved in cystic
fibrosis, Dent's disease, diabetes, and autosomal dominant polycystic kidney disease (Marshansky,
V., et al. Electrophoresis 18: 2661-2676 (1997)).

The new human ARF-related protein of SEQ ID NO: 397, encoded by clone 160-28-4-0-C4-CS
in one embodiment, and the related polynucleotides, provide new compositions which are useful
in the diagnosis, treatment, and prevention of secretory, exocytosis, endocytosis and other
"sorting disorders."

The subject invention provides a polypeptide comprising the amino acid sequence of SEQ
ID NO: 397 or clone 160-28-4-0-C4-CS, or biologically active fragments thereof. The intact protein
of interest is 173 amino acids in length, has an ARF family amino acid motif (Pfam), and has an
ATP/GTP-binding site motif A (P-loop) (PS00017). The protein of SEQ ID NO: 397 or clone
160-28-4-0-C4-CS also has chemical and structural similarity with human ARL1 (P40616), ARD-1
(R66033) and ARF6 (GI 178989) (31%, 31% and 27% identity, respectively). The amino acid
length of SEQ ID NO: 397 is similar to those of the aforementioned ARFs. Biologically active
fragments of SEQ ID NO: 397 have one or more of the biological activities typically associated with the
full-length protein. In one embodiment, the protein is encoded by clone 160-28-4-0-C4-CS.

The invention also provides variants of the protein of SEQ ID NO: 397 or clone
160-28-4-0-C4-CS. The variants have at least about 80%, more preferably at least about 90%, and most
preferably at least about 95% amino acid sequence identity to the amino acid sequence of SEQ ID
NO: 397 or clone 160-28-4-0-C4-CS. Variants according to the subject invention have at least one
functional and/or structural characteristic of ARFs. The invention also provides biologically active
fragments of the variant proteins.
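
By way of illustration only, the identity thresholds recited above (about 80%, 90%, and 95%) can be checked with a short calculation. The sketch below assumes that the two amino acid sequences have already been aligned by some external tool and that gaps are written as "-"; the function names and the convention of ignoring gapped columns are assumptions made for this example and are not prescribed by the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must come from the same pairwise alignment, so they must have
    equal length. Ignoring gapped columns is one common convention (assumed here).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def identity_tier(identity: float) -> str:
    """Map a percent identity onto the 80% / 90% / 95% tiers named in the text."""
    if identity >= 95.0:
        return "most preferred (>= 95%)"
    if identity >= 90.0:
        return "more preferred (>= 90%)"
    if identity >= 80.0:
        return "variant (>= 80%)"
    return "below 80%"
```

A candidate variant would be aligned to SEQ ID NO: 397 first, and the two aligned strings passed to `percent_identity`; because the text says "at least about", the hard cut-offs used here are only an approximation of the recited ranges.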

The invention includes those polynucleotides encoding the protein of SEQ ID NO: 397 or
clone 160-28-4-0-C4-CS, variants of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, and biologically
active fragments of both the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS and variants
thereof. As is apparent to those skilled in the art, a variety of different DNA sequences can encode
the amino acid sequence of the proteins, variants, and biologically active fragments of said proteins
and variants. It is well within the skill of a person trained in the art to create these alternative DNA
sequences encoding proteins having the same, or essentially the same, amino acid sequence. These
variant DNA sequences are also within the scope of the subject invention. As used herein,
reference to "essentially the same" sequence refers to sequences that have amino acid substitutions,
deletions, additions, or insertions that do not materially affect biological activity.

The subject invention provides methods of treating cytoskeletal, secretory, and inflammatory
disorders/conditions comprising the administration of therapeutically effective amounts of a
composition comprising the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS. These
methods can also be practiced using variants of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, or
biologically active fragments of either SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, or variants of
SEQ ID NO: 397 or clone 160-28-4-0-C4-CS. Disorders/conditions which can be treated by the
subject invention include, but are not limited to, prostate cancer, brain and other tumors, Lowe's
syndrome, glomerulonephritis, chronic glomerulonephritis, tubulointerstitial nephritis, inherited
X-linked nephrogenic diabetes insipidus, autosomal dominant polycystic kidney disease (ADPKD),
herpes gestationis, dermatitis herpetiformis, lupus erythematosus, Crohn's disease, irritable bowel
syndrome and Addison's disease; secretory/endocytotic disorders such as cystic fibrosis,
glucose-galactose malabsorption syndrome, hypercholesterolemia, hyper- and hypoglycemia, Graves'
disease, goiter, and Cushing's disease; conditions associated with abnormal vesicle trafficking,
including acquired immunodeficiency syndrome (AIDS); allergies including hay fever, asthma, and
urticaria (hives); autoimmune hemolytic anemia; multiple sclerosis; myasthenia gravis; rheumatoid
and osteoarthritis; Chediak-Higashi and Sjogren's syndromes; toxic shock syndrome; traumatic
tissue damage; viral, bacterial, fungal, helminthic, and protozoal infections.

In another embodiment, a vector capable of expressing the protein of SEQ ID NO: 397 or
clone 160-28-4-0-C4-CS, or biologically active fragments thereof, can be administered to a subject
to treat or prevent disorders including, but not limited to, those described above. Alternatively, the
vector can encode a variant, or biologically active fragment of the variant protein. Multiple vectors
encoding any combination of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, variants, and/or
biologically active fragments of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS and/or variants can
be administered to a subject.

In a further embodiment, a pharmaceutical composition comprising a substantially purified
protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS (and/or biologically active fragments
thereof), in conjunction with a suitable pharmaceutical carrier, can be administered to a subject to
treat or prevent the above-mentioned disorders. Alternatively, a pharmaceutical composition
comprising a substantially purified variant protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS
(and/or biologically active fragments thereof), in conjunction with a suitable pharmaceutical carrier,
can be administered in the aforementioned therapeutic regimens. As would be apparent to the
skilled artisan, any therapeutically effective combination of the protein encoded by SEQ ID NO:
397 or clone 160-28-4-0-C4-CS (and/or biologically active fragments thereof) and variants of SEQ
ID NO: 397 or clone 160-28-4-0-C4-CS (and/or biologically active fragments thereof), in
conjunction with a suitable pharmaceutical carrier, can be used in the aforementioned therapeutic
regimens.

ARFs are known to be involved in regulated transport of vesicles. Therefore, in another
embodiment, the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS, variants, and/or
biologically active fragments of said proteins and/or variants can be used as a component of drug
delivery vehicles such as colloids or liposomes. The protein of SEQ ID NO: 397 or clone
160-28-4-0-C4-CS, variants, and/or biologically active fragments of said proteins and/or variants can be
incorporated into the lipid membranes of liposomes and can serve as specific targeting agents.
Methods for designing such drug delivery systems are known to those skilled in the art and can be
practiced according to conventional pharmaceutical principles (Smith H.J. Introduction to the
principles of drug design and action, 3rd ed. (1998); Chien Y.W. Novel Drug Delivery Systems, 2nd
ed. (1992); Storm G. et al. J. Liposome Res. 4: 641-666 (1994); and Crommelin D.J.A. et al. Adv.
Drug Delivery Rev. 17: 49-60 (1995)).

In another embodiment of the invention, the polynucleotides encoding the protein of SEQ
ID NO: 397 or clone 160-28-4-0-C4-CS can be used for therapeutic purposes. Polynucleotides
encoding fragments of the protein of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS can also be used
in therapeutic regimens. In one aspect, the complement of the polynucleotide encoding the protein
of SEQ ID NO: 397 or clone 160-28-4-0-C4-CS can be used in situations in which it would be
desirable to block the transcription of the mRNA. Modifications of gene expression can be obtained
by designing complementary sequences or antisense molecules (DNA, RNA, or PNA) to the
control, 5', or regulatory regions of the gene encoding the protein of interest. Such technology is
now well known in the art, and sense or antisense oligonucleotides or larger fragments can be
designed from various locations along the coding or control regions of sequences encoding the
protein of interest. Methods of treatment utilizing antisense technology are also well known to
those skilled in the art.
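
As a minimal sketch of the complementary-sequence design mentioned above, an antisense DNA oligonucleotide for a chosen window of a sense strand is simply the reverse complement of that window. The function name, the 0-based `start` index, and the restriction to unambiguous A/C/G/T bases are assumptions made for this illustration; choosing an effective target window (accessibility, specificity, chemistry) is outside its scope.

```python
# Watson-Crick pairing for DNA; case is preserved by the translation table.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_oligo(sense_5to3: str, start: int, length: int) -> str:
    """Return the antisense oligo (written 5'->3') for a window of the sense strand.

    sense_5to3: sense-strand DNA written 5'->3'
    start:      0-based index of the first base of the target window (assumed convention)
    length:     number of bases in the window
    """
    if start < 0 or length < 1:
        raise ValueError("start must be >= 0 and length must be >= 1")
    window = sense_5to3[start:start + length]
    if len(window) != length:
        raise ValueError("window extends past the end of the sequence")
    if set(window.upper()) - set("ACGT"):
        raise ValueError("window contains characters other than A, C, G, T")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return window.translate(_DNA_COMPLEMENT)[::-1]
```

For an RNA or PNA antisense molecule the same pairing logic applies, with the base alphabet adjusted accordingly.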

Another embodiment of the invention provides methods of assessing PLD modulation by
using ARF properties of the protein of interest.

In another embodiment, antibodies which specifically bind the protein of SEQ ID NO: 397
or clone 160-28-4-0-C4-CS can be used for the diagnosis of disorders characterized by expression
of the protein, or in assays to monitor patients being treated with the protein of interest. Methods of
making both polyclonal and monoclonal antibodies are well known in the art. Diagnostic assays
which can be used in this aspect of the invention include, but are not limited to, ELISAs, RIAs, and
FACS, and are well known in the art. These assays also provide a basis for diagnosing or
identifying altered or abnormal levels of expression of SEQ ID NO: 397, or of the polypeptides
encoded by the human cDNA of clone 160-28-4-0-C4-CS, as compared to normal individuals. These
screening methods are, likewise, well known to the skilled artisan.

In another embodiment of the invention, the protein of interest, its catalytic or immunogenic
fragments, or oligopeptides thereof can be used for screening libraries of compounds in any of a
variety of drug screening techniques. The fragment employed in such screening can be free in
solution, affixed to a solid support, recombinantly expressed on, or chemically attached to, a cell
surface, or located intracellularly. The formation of binding complexes between the protein of
interest and the agent being tested can be measured by methods well known to those skilled in the
art. Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT
application WO84/03564.)

In another embodiment of the invention, the polynucleotides encoding the protein of interest
can be used for diagnostic purposes. The polynucleotides can be used to detect and quantify gene
expression in biopsied tissues in which expression of the protein of interest can be correlated with a
disease or condition. Such diagnostic assays are well known in the art and can be used to monitor
regulation of the levels of the protein of interest during therapeutic intervention and/or to determine
the absence, presence, or excess expression of the protein of interest. Examples of such conditions
and disorders have been provided supra. The polynucleotide sequences encoding the protein of
interest can be used, for example, in Southern or Northern analyses, dot blot, or other
membrane-based technologies; in PCR technologies; in dipstick, pin, and ELISA assays; and in microarrays
utilizing fluids or tissues from patients to detect altered expression of the protein of SEQ ID
NO: 397 or clone 160-28-4-0-C4-CS. Such qualitative or quantitative methods are well known in the
art.

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotide sequences described herein can be used as targets in a microarray. The microarray
can be used to monitor the expression level of large numbers of genes simultaneously and to
identify genetic variants, mutations, and polymorphisms. This information can be used to determine
gene function, to understand the genetic basis of a disorder, to diagnose a disorder, and to develop
and monitor the activities of therapeutic agents. Microarrays can be prepared, used, and analyzed
using methods known in the art. (See, e.g., Brennan, T. M. et al. (1995) U.S. Pat. No. 5,474,796;
Heller, R. A. et al. (1997) Proc. Natl. Acad. Sci. 94: 2150-2155; and Heller, M. J. et al. (1997) U.S.
Pat. No. 5,605,662.)

Another embodiment of the subject invention provides nucleic acid sequences encoding the
protein of interest which can be extended utilizing a partial nucleotide sequence and various
PCR-based methods. This aspect of the invention provides methods for the detection of upstream
sequences, such as promoters and regulatory elements. Methods of practicing this aspect of the
invention are also well known in the art.

In other embodiments of the disclosed therapeutic regimens, any of the proteins, variants,
biologically active fragments, antibodies, complementary sequences, or vectors of the invention can
be administered in combination with other appropriate therapeutic agents. Selection of the
appropriate agents for use in combination therapy can be made by one of ordinary skill in the art.
The combination of therapeutic agents can act synergistically to effect the treatment or prevention
of the various disorders described above. In particular, purified protein can be used to produce
antibodies or to screen libraries of pharmaceutical agents to identify those which specifically bind
the protein of interest. Neutralizing antibodies are especially preferred for therapeutic use.

Protein of SEQ ID NO: 287 (internal designation 174-5-3-0-H7-CS)

The protein of SEQ ID NO: 287, encoded by human cDNA of SEQ ID NO: 46 (clone
174-5-3-0-H7-CS), is highly homologous (more than 99% identity in amino acids) to the human protein
encoded by the CLN8 gene listed in Genbank under accession number AF123757. The two
proteins differ by two conservative amino acid substitutions (alanine for valine at position 155 and
serine for asparagine at position 225). In addition, the protein encoded by 174-5-3-0-H7-CS
contains seven transmembrane domains. These domains are located at amino acids 25-45, 71-91,
100-120, 133-153, 160-180, 205-225, and 228-248 as predicted by the software TopPred II (Claros
and von Heijne, CABIOS applic. Notes, 10:685-686 (1994)). The protein of SEQ ID
NO: 287 also exhibits a signal peptide at positions 1-50 and a retention signal KKRP from positions
283 to 286.
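
The transmembrane domains listed above were predicted with TopPred II. For readers unfamiliar with hydropathy-based prediction, the following sketch shows the general idea using a plain sliding-window average of the Kyte-Doolittle hydropathy scale. It is not the TopPred II algorithm and is not expected to reproduce the positions given above; the window of 19 residues and the threshold of 1.6 are commonly quoted Kyte-Doolittle conventions and are treated here as assumptions, as are the function names.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list:
    """Mean Kyte-Doolittle hydropathy of every full window along the sequence.

    Raises KeyError if the sequence contains a non-standard residue code.
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        return []
    scores = [KD[aa] for aa in seq]
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]  # slide the window by one residue
        profile.append(total / window)
    return profile


def candidate_tm_segments(seq: str, window: int = 19, threshold: float = 1.6) -> list:
    """Merge windows scoring at or above the threshold into candidate segments.

    Returns (start, end) tuples using 1-based, inclusive residue positions.
    """
    segments = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score < threshold:
            continue
        start, end = i + 1, i + window
        if segments and start <= segments[-1][1] + 1:
            segments[-1] = (segments[-1][0], end)  # extend the previous segment
        else:
            segments.append((start, end))
    return segments
```

Because every window that clears the threshold is merged in full, the reported segments tend to extend a few residues beyond the hydrophobic core; dedicated predictors refine the boundaries and also consider topology.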

CLN8 was identified recently by positional cloning (Ranta et al., Nat Genet. 1999
Oct;23(2):233-6). CLN8 encodes a 286 amino-acid putative transmembrane protein with no
homology to previously known proteins. A naturally-occurring missense mutation in codon 24
(R24G at the border of the first putative transmembrane domain) is the molecular basis for EPMR
("progressive epilepsy with mental retardation", MIM 600143). EPMR, also called Northern
Epilepsy, is an autosomal recessive disorder characterized by normal early development, onset of
generalized tonic-clonic seizures between the ages of 5 and 10 years, and subsequent progressive
mental retardation. Neuropathological findings have shown that EPMR is a new member of the
neuronal ceroid lipofuscinosis (NCL) group of neurodegenerative disorders. The NCLs are a
genetically heterogeneous group of progressive neurodegenerative disorders characterized by the
accumulation of autofluorescent lipopigment in various tissues. CLN8 is the eighth gene to be
linked to the NCL group of neurodegenerative disorders.

Subsequently, the homologous mouse gene (Cln8) was sequenced (82% nucleotide identity
with the human gene) and localized to the region of the mouse genome linked to motor neuron
degeneration, mouse mnd. Mnd is a naturally-occurring mouse mutant with intracellular
autofluorescent inclusions similar to those seen in EPMR. A mutation in mnd mouse DNA was
identified, indicating that mnd is a murine ortholog for CLN8 (Ranta et al., Nat Genet. 1999
Oct;23(2):233-6), and that mice containing mutations in Cln8 represent a murine model for NCL
disorders.

Recent experimental evidence has confirmed the transmembrane nature of the CLN8
protein (Lonka L et al., Hum Mol Genet. 2000 Jul 1;9(11):1691-7). CLN8 resides in the
endoplasmic reticulum (ER) and recycles between the ER and the ER-Golgi intermediate
compartment (ERGIC) via a KKXX ER-retrieval motif at its C-terminus (KKRP, amino acids
283-286). This motif is recognized and bound by COPI, a vesicle-coating protein found in retrograde
vesicles delivering cargo from the cis Golgi to the ER. The 30 kD CLN8 protein is not processed
during its maturation (in particular it is not N-glycosylated). The EPMR-associated R24G mutation
does not alter cellular localization in humans.
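
The KKXX retrieval motif described above is defined purely by position: two lysines at the fourth and third positions from the C-terminus, followed by any two residues. A minimal sketch of such a check is given below; the function name and the 1-based, inclusive position convention are assumptions for this example, and the test sequence suggested afterwards is a hypothetical placeholder rather than the actual sequence of SEQ ID NO: 287.

```python
import re

# Two lysines followed by exactly two residues at the very end of the chain.
_KKXX = re.compile(r"KK[A-Z]{2}$")


def find_kkxx_motif(protein: str):
    """Return (start, end, motif) for a C-terminal KKXX motif, or None if absent.

    Positions are 1-based and inclusive, matching the numbering used in the text.
    """
    seq = protein.strip().upper()
    match = _KKXX.search(seq)
    if match is None:
        return None
    return (match.start() + 1, match.end(), match.group())
```

Applied to any 286-residue sequence ending in KKRP (for instance a placeholder of 282 alanines followed by KKRP), the function reports positions 283 to 286, consistent with the location stated above. A sequence-level match only indicates the presence of the motif; functional retrieval depends on the motif being exposed on the cytosolic side of the membrane.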

The subject invention provides a polypeptide encoded by SEQ ID NO: 287 and biologically
active fragments of said polypeptide. Compositions comprising polypeptides and pharmaceutically
acceptable carriers are likewise provided. Preferred polypeptides, and biologically active fragments
thereof, have any of the biological activities or domains/motifs described herein and/or contain the
amino acids at positions 155, 225, and 283 to 286. In one embodiment, the protein/polypeptide of
SEQ ID NO: 287 is encoded by clone 174-5-3-0-H7-CS.

The ER/ERGIC cellular localization of the protein of this invention can be used to target
compounds to the ER/ERGIC. This targeting can be observed using any of the techniques known to
those skilled in the art, including those described in Lonka L et al., Hum Mol Genet. 2000 Jul
1;9(11):1691-7. In this aspect of the invention, the protein of SEQ ID NO: 287, or biologically
active fragments thereof, can be used to target liposomes, vesicles, or colloids to the ER/ERGIC
compartment where active agents can be delivered. Methods of making and using targeted
liposomes are well known in the art.

In another embodiment, liposomes comprising the protein of SEQ ID NO: 287 can contain a
second targeting agent for the specific selection of a target cell. The second targeting agent can be
selected for its ability to specifically target a cell or tissue. Thus, the second targeting agent can be
specific for tumor markers, such as HER2. Alternatively, markers associated with specific cell
types can be used (e.g., CD34, CD4, CD8, etc.). In a preferred embodiment, the second targeting
agent is an antibody. Active agents include, but are not limited to, chemotherapeutic agents, protein
cross-linking agents, inhibitors of protein synthesis, anti-bacterial agents (e.g., antibiotics), antiviral
agents, and/or anti-parasitic agents. The ability to bind the COPI coatomer can be assayed as
described in Cosson P, Letourneur F, Science. 1994 Mar 18;263(5153):1629-31.

In another embodiment, the present invention provides methods of, and compositions for,
identifying specific cellular compartments, such as the ER, ERGIC, and retrograde transport
vesicles. This embodiment provides antibodies which specifically bind the protein of SEQ ID
NO: 287, or biologically active fragments thereof, which are labeled with detectable markers, such
as gold particles, enzymes, radioisotopes, or paramagnetic labels. ER, ERGIC, and retrograde
transport vesicles can be identified in samples according to well-known immuno-diagnostic
protocols. The antibodies, either monoclonal or polyclonal, can be made according to well-known
methods. In a preferred embodiment, the antibodies bind to the ER retention signal.

In another embodiment, the protein of the invention or part thereof can be used as a reagent
for differential identification of the tissue(s) or cell type(s) present in a biological sample and for
diagnosis of diseases and conditions, which include, but are not limited to, asthma, pulmonary
edema, atherosclerosis, restenosis, stroke potential, thrombosis and hypertension. Similarly, the
protein of the invention, or biologically active fragments thereof, and antibodies thereto can provide
immunological probes for differential identification of the tissue(s) or cell type(s). In a number of
disorders listed above, particularly of the pulmonary and cardiovascular systems, expression of this
protein at significantly higher or lower levels can be routinely detected in certain tissues or cell
types (e.g., vascular tissues, cancerous and wounded tissues) or bodily fluids (e.g., lymph, serum,
plasma, urine, synovial fluid and spinal fluid) or other tissue or cell samples taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., the expression
level in healthy tissue or bodily fluid from an individual not having the disorder.

Indeed, the first 80 amino acids of the protein of the invention are identical to two
polypeptides claimed in Patent WO 99/35158, hereby incorporated by reference in its entirety (SEQ
ID NO: 98 and SEQ ID NO: 162, corresponding to Geneseq accession numbers Y38413/Y38428 and
Y38492), which are over-expressed in pulmonary and endothelial tissues.

The tissue distribution in pulmonary and endothelial tissues indicates that the protein
product described in WO 99/35158 is useful for the treatment and diagnosis of cardiovascular and
respiratory or pulmonary disorders such as asthma, pulmonary edema, pneumonia, atherosclerosis,
restenosis, stroke, angina, thrombosis, hypertension, inflammation, and wound healing. Those
conditions can be diagnosed by determining the amount of the protein of the invention in a sample.
Thus, antibodies raised against the protein of SEQ ID NO: 287, or an immunogenic fragment of the
protein, can be used in diagnostic, prognostic, or screening assays such as those taught in WO
99/35158.



Protein of SEQ ID NO: 270 (internal designation 116-119-3-0-H5-CS)

The protein of SEQ ID NO: 270, encoded by the extended cDNA SEQ ID NO: 29, is
homologous to the human mitochondrial ATP synthase f subunit or ATPK (E.C. 3.6.1.34) (Swissprot
accession number P56134) and is overexpressed in fetal kidney.

The protein of SEQ ID NO: 270, composed of 88 amino acid residues, contains one
transmembrane segment (positions 1 to 55) predicted by the software TopPred II (Claros and von
Heijne, CABIOS applic. Notes, 10:685-686 (1994)). BLAST results show that 100% homology is found
between amino acids 5 to 88 of the protein of the invention and amino acids 10 to 93 of human ATP
synthase f chain (93 amino acids total), exon 1 of the cDNA SEQ ID NO: 29 accounting for the difference
between the two proteins (the last three exons show 100% homology). Thus, the protein of the invention
represents a new isoform of human mitochondrial ATP synthase f subunit. It is interesting to note that
the same splice variant is found in bovine, pig and mouse species.

The mitochondrial electron transport (or respiratory) chain is a series of enzyme complexes in
the mitochondrial membrane that is responsible for the transport of electrons from NADH to oxygen
and the coupling of this oxidation to the synthesis of ATP (oxidative phosphorylation). ATP then
provides the primary source of energy for driving a cell's many energy-requiring reactions. ATP synthase
(F0F1 ATPase) is the enzyme complex at the terminus of this chain and serves as a reversible coupling
device that interconverts the energies of an electrochemical proton gradient across the mitochondrial
membrane into either the synthesis or hydrolysis of ATP. This gradient is produced by other enzymes
of the respiratory chain in the course of electron transport from NADH to oxygen. When the cell's
energy demands are high, electron transport from NADH to oxygen generates an electrochemical
gradient across the mitochondrial membrane. Proton translocation from the outer to the inner side of the
membrane drives the synthesis of ATP. Under conditions of low energy requirements and when there is
an excess of ATP present, this electrochemical gradient is reversed and ATP synthase hydrolyzes ATP.
The energy of hydrolysis is used to pump protons out of the mitochondrial matrix. ATP synthase is,
therefore, a dual complex, the F0 portion of which is a transmembrane proton carrier or pump, and the
F1 portion of which is catalytic and synthesizes or hydrolyzes ATP. Mammalian ATP synthase
complex consists of sixteen different polypeptides (Walker, J. E. and Collinson, T. R. (1994) FEBS
Lett. 346: 39-43). Six of these polypeptides (subunits alpha, beta, gamma, delta, epsilon, and an ATPase
inhibitor protein IF1) comprise the globular catalytic F1 ATPase portion of the complex, which lies
outside of the mitochondrial membrane. The remaining ten polypeptides (subunits a, b, c, d, e, f, g, F6,
OSCP, and A6L) comprise the proton-translocating, membrane-spanning F0 portion of the complex.
Like other members of the respiratory chain, all but two of the polypeptide subunits of ATP synthase
are nuclear gene products that are imported into the mitochondria. Enzyme complexes similar to
mammalian ATP synthase are found in all cell types and in chloroplast and bacterial membranes. This
universality indicates the central importance of this enzyme to ATP metabolism. Transcriptional
regulation of these nuclear encoded genes appears to be the predominant means for controlling the
biogenesis of ATP synthase. Multiple mitochondrial pathologies exist because of the essential role of
mitochondrial oxidative phosphorylation in cellular energy production, in the generation of reactive
oxygen species and in the initiation of apoptosis (Wallace, Science, 283:1482-1488, 1999). It is now
clear that mitochondrial diseases encompass an assemblage of clinical problems commonly involving
tissues that have high energy requirements such as heart, muscle and the renal and endocrine systems.

Over the past 11 years, a considerable body of evidence has accumulated implicating defects in the
mitochondrial energy-generating pathway, oxidative phosphorylation, in a wide variety of degenerative
diseases including myopathy and cardiomyopathy. Most classes of pathogenic mitochondrial DNA
mutations affect the heart, in association with a variety of other clinical manifestations that can include
skeletal muscle, the central nervous system (including eye), the endocrine system, and the renal system.
Nuclear mutations causing mitochondrial disorders have been described. They are often found in highly
conserved subunits. Mitochondrial disorders with nuclear mutations include: myopathies (PEO,
MNGIE, congenital muscular dystrophy, carnitine disorders), encephalopathies (Leigh, Infantile,
Wilson's disease, Deafness-Dystonia syndrome), other systemic disorders and cardiomyopathies.

The discovery of a new ATP synthase subunit, and of polynucleotides encoding it, satisfies a need
in the art by providing new compositions which are useful for the diagnosis, prevention, and treatment
of cancer, myopathies, immune disorders, and neurological disorders.

It is believed that the protein of SEQ ID NO: 270 or part thereof plays a role in cellular
respiration, preferably as a mitochondrial ATP synthase subunit. Preferred polypeptides of the
invention are fragments of SEQ ID NO: 270 having any of the biological activities described herein.

An object of the present invention is to provide compositions and methods for targeting heterologous
compounds, either polypeptides or polynucleotides, to mitochondria by recombinantly or chemically
fusing a fragment of the protein of the invention to a heterologous polypeptide or polynucleotide.
Preferred fragments are the signal peptide, amphiphilic alpha helices and/or any other fragments of the
protein of the invention, or part thereof, that may contain targeting signals for mitochondria,
including but not limited to matrix targeting signals as defined in Herrmann and Neupert, Curr.
Opinion Microbiol. 3:210-4 (2000); Bhagwat et al. J. Biol. Chem. 274:24014-22 (1999); Murphy
Trends Biotechnol. 15:326-30 (1997); Glaser et al. Plant Mol Biol 38:311-38 (1998); Ciminale et al.
Oncogene 18:4505-14 (1999). Such heterologous compounds may be used to modulate
mitochondrial activities. For example, they may be used to induce and/or prevent
mitochondrial-induced apoptosis or necrosis. In addition, heterologous polynucleotides may be used for
mitochondrial gene therapy to replace a defective mitochondrial gene and/or to inhibit the
deleterious expression of a mitochondrial gene.

The invention further relates to methods and compositions using the protein of the invention
or part thereof to diagnose, prevent and/or treat several disorders in which the mitochondrial respiratory
electron transport chain is impaired, including but not limited to mitochondriocytopathies, necrosis,
aging, myopathies, cancer and neurodegenerative diseases such as Alzheimer's disease,
Huntington's disease, Parkinson's disease, epilepsy, Down's syndrome, dementia, multiple sclerosis,
and amyotrophic lateral sclerosis. For diagnostic purposes, the expression of the protein of the
invention could be investigated using any of the Northern blotting, RT-PCR or immunoblotting
methods described herein and compared to the expression in control individuals. For prevention
and/or treatment purposes, the protein of the invention may be used to enhance electron transport
and increase energy delivery using any of the gene therapy methods described herein or known to
those skilled in the art.

In another embodiment, the invention further relates to methods and compositions using the
protein of the invention or part thereof to diagnose, prevent and/or treat several disorders in which the
mitochondrial respiratory electron transport chain needs to be impaired, including but not limited to
Sjogren's syndrome, Addison's disease, bronchitis, dermatomyositis, polymyositis, glomerulonephritis,
diabetes mellitus, emphysema, Graves' disease, atrophic gastritis, lupus erythematosus, myasthenia
gravis, multiple sclerosis, autoimmune thyroiditis, ulcerative colitis, anemia, pancreatitis, scleroderma,
rheumatoid and osteoarthritis, asthma, allergic rhinitis, atopic dermatitis,
and gout, using any techniques known to those skilled in the art including the antisense or
triple helix strategies described herein.

Moreover, antibodies to the protein of the invention or part thereof may be used for
detection of mitochondria and/or mitochondrial membranes using any techniques known
to those skilled in the art.

Protein of SEQ ID NO: 271 (internal designation 117-001-5-0-G3-CS)

The protein of SEQ ID NO: 271 is homologous to the family of lipopolysaccharide (LPS)
binding proteins (LBPs). Several families of proteins have the ability to bind LPS including (a) the
lipopolysaccharide-binding proteins (LBPs), and (b) the bactericidal permeability-increasing
proteins (BPIs). Cholesteryl ester transfer protein (CETP), which is involved in the transfer of
insoluble cholesteryl esters in reverse cholesterol transport, shares some homology to members of
the LPS binding family of proteins.

Lipopolysaccharide (LPS), alternatively known as bacterial endotoxin, is a major component of
the outer membrane of Gram-negative bacteria. It consists of serotype-specific O-side chain
polysaccharides linked to a core oligosaccharide and Lipid A. LPS is a potent mediator of the
inflammatory response and stimulates the expression of many pro-inflammatory and pro-coagulant
compounds in monocytes, macrophages, and endothelial cells. While these responses are important in
containing and eliminating localized infections, systemic exposure to LPS can lead to a number of
adverse effects. These include: (a) induction of an inflammatory cascade, (b) damage to the
endothelium, (c) widespread coagulopathies, and (d) organ damage.

Systemic exposure to LPS can arise from direct infection by Gram-negative bacteria, leading to
the complications of Gram-negative sepsis. Examples of diseases associated with Gram-negative
bacterial infection or endotoxemia include bacterial meningitis, neonatal sepsis, cystic
fibrosis, inflammatory bowel disease, liver cirrhosis, Gram-negative pneumonia, Gram-negative
abdominal abscess, hemorrhagic shock, and disseminated intravascular coagulation. Subjects who are
leukopenic or neutropenic, including subjects treated with chemotherapy or immunocompromised
subjects, are particularly susceptible to bacterial infection and the subsequent effects of endotoxin
exposure.

Gram-negative sepsis remains one of the primary causes of severe systemic inflammation in
hospitalized and immunocompromised patients. Alternatively, changes in gut permeability caused by a
variety of circumstances, including trauma, can lead to translocation of bacteria/LPS into the bloodstream.
Bacteria translocated from the gut are thought to play a major role in post-surgical immunosuppression
(Little et al., Surgery 114:87-91 (1993)) and hemorrhagic shock. Therefore, there is great interest in
characterizing proteins involved in the biological response to LPS and in discovering therapies that can
counteract the effects of LPS in pathological situations.

LBP is a 60 kDa glycoprotein synthesized in the liver and present in normal human serum.
LBP expression is upregulated in response to infectious, inflammatory, and toxic mediators. LBP
expression has been induced in animals challenged with LPS, silver nitrate, turpentine, and
Corynebacterium parvum (Geller et al., Surgery 128:22-28 (1993); Gallay et al., Infect. Immun.
61:378-383 (1993); Tobias et al., J. Exp. Med. 164:777-793 (1986)). LBP levels are correlated with
exposure to LPS, and elevated levels (particularly persistent elevated levels) have been correlated with
poor clinical outcomes in septic patients (U.S. Patent Nos. 5,484,705, and 5,804,367, hereby
incorporated by reference in their entirety).

A portion of the LBP molecule (the N-terminal 1-197 aa) binds to the lipid A portion of the
LPS molecule to form a high affinity LBP/LPS complex (Tobias, et al., J. Biol. Chem. 264:10867-10871
(1989)). The LBP/LPS complex potentiates the cellular response to LPS via an interaction
with the monocytic differentiation antigen CD14 (Wright et al., Science 249:1431-1433 (1990);
Lee et al., J. Exp. Med. 175:1697-1705 (1992)). LPS can be transferred from LBP to membrane-bound
or soluble CD14. Activated CD14 can then interact with endothelial cells to elicit an
inflammatory response. The C-terminal portion of LBP is required to transfer LPS to CD14 (U.S.
Pat. No. 5,731,415; Theofan et al., J. Immunol. 152:3624-29 (1994); Han et al., J. Biol. Chem.
269:8172-75 (1994)). Evidence also suggests that LBP can neutralize LPS by an interaction with
serum lipoproteins or through the internalization of an LBP/LPS/CD14 complex by neutrophils
(Wurfel et al., J. Exp. Med. 180:1025-1035 (1994); Wurfel et al., J. Exp. Med. 181:1743-54 (1995);
Gegner et al., J. Biol. Chem. 270:5320-5325 (1995)).

The subject invention provides the polypeptide of SEQ ID NO: 271 and polynucleotide
sequences encoding the amino acid sequence of SEQ ID NO: 271. In one embodiment, the
polypeptides of SEQ ID NO: 271 are interchanged with the polypeptides encoded by the human
cDNA of clone 181-20-3-0-B5-CS. Also included in the invention are biologically active fragments
of the protein of SEQ ID NO: 271 and polynucleotide sequences encoding these biologically active
fragments. In a preferred embodiment, biologically active fragments of SEQ ID NO: 271 are
encoded by clone 181-20-3-0-B5-CS and comprise the first 181 amino acids encoded by clone
181-20-3-0-B5-CS. "Biologically active fragments" are defined as those peptide or polypeptide
fragments of SEQ ID NO: 271 which have at least one of the biological functions of the full length
protein (e.g., the ability to bind bacterial LPS).

The invention also provides variants of SEQ ID NO: 271. These variants have at least
about 80%, more preferably at least about 90%, and most preferably at least about 95% amino acid
sequence identity to the amino acid sequence of SEQ ID NO: 271. Variants according to the
subject invention also have at least one functional or structural characteristic of SEQ ID NO: 271,
such as the biological functions described above. The invention also provides biologically active
fragments of the variant proteins. Unless otherwise indicated, the methods disclosed herein can be
practiced utilizing the polypeptide of SEQ ID NO: 271 or variants thereof. Likewise, the methods
of the subject invention can be practiced using biologically active fragments of the protein of SEQ ID NO:
or variants of said biologically active fragments.
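
By way of illustration only, the percent amino acid sequence identity referred to above can be computed along the following lines. This is a minimal sketch, not the method of the specification: it assumes the two sequences have already been aligned to equal length with "-" marking gaps, it counts identity over ungapped columns only (other conventions exist), and it treats the "about 80%, 90%, 95%" tiers as hard cutoffs. The function names and the ten-residue peptides are hypothetical and are not taken from SEQ ID NO: 271.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_tier(pct: float):
    """Return the highest tier (95, 90 or 80) met by pct, or None if below 80."""
    for cutoff in (95.0, 90.0, 80.0):
        if pct >= cutoff:
            return cutoff
    return None


reference = "MKTLLVAAGL"  # hypothetical peptide, not SEQ ID NO: 271
variant = "MKTLIVAAGL"    # one substitution (L -> I) relative to the reference
pct = percent_identity(reference, variant)
print(f"{pct:.1f}% identity, tier met: {identity_tier(pct)}")
```

In practice the alignment itself would be produced by a dedicated alignment program, and the choice of denominator (alignment length, shorter sequence, or ungapped columns) should be stated alongside any reported identity value.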

Because of the redundancy of the genetic code, a variety of different DNA sequences can
encode SEQ ID NO: 271. It is well within the skill of a person trained in the art to create these
alternative DNA sequences which encode proteins having the same, or essentially the same, amino
acid sequence. These variant DNA sequences are, thus, within the scope of the subject invention.
As used herein, reference to "essentially the same sequence" refers to sequences that have amino
acid substitutions, deletions, additions, or insertions that do not materially affect biological activity.
Fragments retaining one or more characteristic biological activities of SEQ ID NO: are also included
in this definition.

"Recombinant nucleotide variants" are alternate polynucleotides which encode a particular
protein. They can be synthesized, for example, by making use of the "redundancy" in the genetic
code. Various codon substitutions, such as the silent changes which produce specific restriction
sites or codon usage-specific mutations, can be introduced to optimize cloning into a plasmid or
viral vector or expression in a particular prokaryotic or eukaryotic host system, respectively.
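
The silent codon substitutions described above can be sketched as follows. This is an illustrative example under stated assumptions: it uses the standard genetic code, and the twelve-nucleotide sequences and function names are hypothetical, not drawn from the cDNA encoding SEQ ID NO: 271. Two synonymous changes (GGT to GGA and TCT to TCC) leave the encoded peptide unchanged while introducing a GGATCC site, the recognition sequence of the restriction enzyme BamHI.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete codons of dna in frame 1; stops appear as '*'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(codon: str) -> list:
    """All codons that encode the same amino acid as the given codon."""
    aa = CODON_TABLE[codon.upper()]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


original = "ATGGGTTCTCTG"  # hypothetical coding sequence: Met-Gly-Ser-Leu
variant = "ATGGGATCCCTG"   # GGT->GGA and TCT->TCC, both silent

print(translate(original), translate(variant))
print("BamHI site in original:", "GGATCC" in original)
print("BamHI site in variant: ", "GGATCC" in variant)
print("Codons for Gly:", synonymous_codons("GGT"))
```

The same lookup could be used for codon usage-specific changes by choosing, among the synonymous codons, the one preferred by the intended host; host codon preference tables are not included here.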

The protein of SEQ ID NO: 271, and variants thereof, can be used to produce antibodies
according to methods well known in the art. The antibodies can be monoclonal or polyclonal.
Antibodies can also be synthesized against fragments of SEQ ID NO: 271, as well as variants
thereof, according to known methods. The subject invention also provides antibodies which
specifically bind to biologically active fragments of SEQ ID NO: 271 or biologically active
fragments of SEQ ID NO: 271 variants.

The subject invention also provides for immunoassays which are used to screen for,
monitor, or diagnose exposure to LPS. In one embodiment, diagnostic assays measure the level of
LBP in patient plasma samples. LBP levels are known to rise in response to exposure to LPS; thus,
the measurement of the level of the protein of SEQ ID NO: 271 can provide an early indication of
Gram-negative infection or of endotoxin exposure.

The subject invention provides methods of treating individuals infected with Gram-negative
bacteria comprising the administration of therapeutically-effective compositions comprising SEQ
ID NO: 271. In one embodiment, the protein lacks the C-terminal portion (or portions of the
C-terminal domain) necessary to transfer LPS to CD14. LPS can be scavenged by the excess
N-terminal fragment and would be unable to induce an inflammatory response (see, e.g., U.S. Patent
No. 5,731,415, hereby incorporated by reference in its entirety).

Another aspect of the subject invention provides methods of prophylaxis. The method
treats individuals by administration of therapeutically-effective amounts of compositions
comprising SEQ ID NO: 271. Instances where this aspect of the invention can be performed
include, but are not limited to, conditions associated with increased translocation of gut bacteria and
endotoxin, particularly prior to surgery. In addition, patients who are at risk for potential
Gram-negative infection, including but not limited to patients undergoing chemotherapy, or patients who are
immunocompromised (for example with AIDS) can benefit from such treatment. Such uses are
described in U.S. Patent No. 5,990,082, hereby incorporated by reference in its entirety.

The N-terminal portion of LBP, which lacks the ability to induce an inflammatory response,
can be fused to other proteins or fragments thereof (such as the bactericidal/permeability-increasing
protein or BPI) which can increase the association of these molecules with LPS and aid in the
clearance of endotoxin from patients who have been exposed to Gram-negative bacteria. Such
preparations can be used to treat and inhibit a number of Gram-negative, Gram-positive,
or fungal infections, as described in the following patents: WO 95/19179 A, WO 95/19180 A, WO
95/19372 A, and WO 96/34873 A, each of which is incorporated by reference in its entirety.

The subject invention also provides methods of removing endotoxin from recombinantly-produced
proteins. In one embodiment, the recombinantly-produced proteins are obtained from
Gram-negative bacteria. In a preferred embodiment, the bacteria are E. coli. In another
embodiment, the protein of SEQ ID NO: 271, biologically active fragments thereof, variants, or
derivatives thereof, are contacted with compositions comprising recombinantly-produced proteins.
The contacting step can take place with SEQ ID NO: 271 immobilized on a substrate or with SEQ
ID NO: 271 present in free solution.

In addition, the protein of SEQ ID NO: 271, biologically active fragments, or derivatives
thereof, can be used in diagnostic assays to measure the level of LPS in patient plasma samples. In
such an assay, serum samples would be bound to a solid matrix, such as a membrane, plastic,
treated plastic, or other supports, and then probed with the protein of SEQ ID NO: 271.
Visualization can be achieved by fusing protein of SEQ ID NO: to any number of enzymes
followed by treatment with a chromogenic, fluorogenic, or luminescent substrate. Alternatively, the
protein of SEQ ID NO: 271, biologically active fragments, variants, or derivatives thereof, can be
linked to a fluorescent or luminescent protein or compound. The linkage can be chemical or made
by recombinant techniques known to those skilled in the art. In addition, antibodies raised against
the protein of SEQ ID NO: 271, biologically active fragments, variants, or derivatives thereof can
be used to visualize the LPS/protein 271 complexes using immunoassays known to those skilled in
the art.

Protein of SEQ ID NO:266 (internal designation 116-110-2-0-F4-CS)

The protein of SEQ ID NO:266, highly expressed in the testis, is encoded by cDNA of SEQ
ID NO:25 and exhibits homology to the Ly-6 family of GPI-linked cell-surface glycoproteins
composed of one or more copies of a conserved domain of about 100 amino-acid residues
(PS00983;LY6_UPAR).

The protein of SEQ ID NO:266 shows significant structural similarities to mouse Ly-6
antigens, human CD59 and a herpes virus CD59 homolog. The protein of SEQ ID NO:266
displays one copy of the motif of the u-PAR/Ly-6 domain, with all ten extracellular cysteine
residues conserved. The mature protein sequence contains a relatively high proportion of cysteine
residues (10/105), which suggests that numerous disulfide bonds stabilize its tertiary structure.
Furthermore, the 124 amino-acid long protein of SEQ ID NO:266 has a size very similar to that of
many members of the Ly-6 family. In addition, the protein of the invention has a predicted signal
peptide structure (positions from 1 to 19) and a C-terminal hydrophobic fragment (positions from
101 to 121) necessary for GPI-anchoring in a membrane. Thus, the protein of the invention has a
clear evolutionary relationship with the Ly-6/uPAR family, particularly with the Ly-6 subfamily.

The Ly-6/uPAR protein family members share one or several repeat units of the Ly-6/uPAR
domain, which is defined by a distinct disulfide bonding pattern between 8 or 10 cysteine residues.
This family can be divided into two subfamilies. One comprises GPI-anchored glycoprotein
receptors with 10 cysteine residues. Another subfamily includes the secreted single-domain snake
and frog cytotoxins, and differs significantly in that its members generally possess only eight
cysteines and no GPI-anchoring signal sequence (Andermann K, et al. Protein Sci 8(4):810-819
(1999)). The Ly-6 family members are low molecular weight phosphatidyl inositol anchored
glycoproteins with remarkable amino acid homology throughout a distinctive cysteine rich protein
domain that is associated predominantly with O-linked carbohydrate. Their GPI links are necessary
to anchor these cell surface proteins to the outside of the lipid bilayer membrane. The Ly-6 family
includes human CD59, which protects from complement-mediated membrane damage, squid Sgp1
and Sgp2, urokinase plasminogen activator receptor, murine Sca-1 and Sca-2, and many other
proteins. The general structure seen within the Ly-6 family resembles that of the receptor for a
urokinase-type plasminogen activator and the alpha-neurotoxins from snake venoms (Fleming T J
et al. J Immunol 150:5379-5390 (1993); Ploug M and V Ellis FEBS Lett 349:163-168 (1994)).

The Ly-6 cell surface proteins are differentially expressed in several hematopoietic lineages
and appear to function in signal transduction and cell activation, predominantly on lymphoid cells in
the mouse. Analyses using anti-Ly-6A/E monoclonal antibodies have also demonstrated in situ
expression of Ly-6 molecules in brain tissue (staining primarily associated with vascular elements
throughout the brain). These proteins do not appear to be expressed during embryonic or neonatal
stages of development (Cray C et al. Brain Res Mol Brain Res 8(1):9-15 (1990)).

Ly-6 protein expression has been shown to be factor-dependent. For example, the
expression of the Ly-6A/E, which normally occurs in hemopoietic stem cells, fibroblasts, and T and
B lymphocytes, has been shown to be greatly induced by IFN-B in various tissues and cell lines. In
addition, the Ly-6E Ag is associated with tyrosine kinases in T cells, and reduced expression of
Ly-6E in T cells impairs normal functional responses, as well as tyrosine kinase activity, in these cells.
Further, the IFNs are important in the generation of memory CD8+ T cells, and it has been
demonstrated that the expression of Ly-6C Ag is a strong marker for the memory phenotype
(Mehran M. et al. Journal of Immunology 163:811-819 (1999)). Like their murine counterparts, a
human homologue of Ly-6 genes, the 9804 gene, is responsive to IFNs. The 9804 gene is also
inducible by retinoic acid during differentiation of acute promyelocytic leukemia cells. Further,
cultured glial and neuronal cells express high levels of Ly-6A/E following incubation with
cytokines, including rIFN-gamma (Cray C et al. Brain Res Mol Brain Res 8(1):9-15 (1990)).
Another member of the Ly-6 family, human protein RoBo-1, shows increased expression in
response to two modulators of bone metabolism, estradiol and intermittent mechanical loading,
suggesting a role in bone homeostasis (Noel LS et al. J Biol Chem, Vol. 273(7):3878-3883 (1998)).
Such factor-dependence of expression makes Ly-6 proteins either candidates or targets for
alloresponses and autoimmune disease. For example, the high level factor-induced expression of
LY-6s has been associated with lupus nephritis (Blake P G et al. J Am Soc Nephrol 4:1140-1150
(1993)).

Murine Ly-6 molecules have interesting patterns of tissue expression during
haematopoiesis, from multipotential stem cells to lineage committed precursor cells, and on specific
leukocyte subpopulations in the peripheral lymphoid tissues. These patterns suggest an intimate
association between the regulation of Ly-6 expression, and the development and homeostasis of the
immune system (Gumley TP et al. Immunol Cell Biol 73(4):277-296 (1995)). Ly-6M messenger
RNA (mRNA) is easily detectable in hematopoietic tissue (bone marrow, spleen, thymus, peritoneal
macrophages) as well as kidney and lung (Patterson JM et al. Blood 95(10):3125-3132 (2000)).

Normally, human blood cells are protected against autologous complement activation by
membrane proteins that block the assembly of functional complement pores. One such protein is
the human Ly-6 protein CD59. Administration of CD59 prevents hemolytic disease or thrombosis. Further,
the CD59 protein may prevent the complement-mediated lysis and activation of endothelial cells
that leads to hyper acute rejection, and therefore may be administered during xenogeneic organ
transplantation (Binette, J. P. and Binette, M. B., Scanning Microsc., 7:1107-10 (1993)).

The surface receptor for urokinase plasminogen activator (uPAR) has been recognized in
recent years as a key molecule in regulating plasminogen mediated extracellular proteolysis.
Surface plasminogen activation controls the connections between cells, basement membrane and
extracellular matrix, and therefore the capacity of cells to migrate and invade neighboring tissues
(Roldan AL et al. EMBO J 9(2):467-474 (1990)). Certain factors of the PA system, such as u-PAR,
have been detected in organs of the male reproductive tract in various species. Morphological
studies provide support for the involvement of the PA system in human male reproductive physiology
(Gunnarsson M et al. Mol Hum Reprod 5(10):934-940 (1999)).

LY-6 proteins have been suggested to play important roles in disorders such as cancers,
nephropathies, autoimmune diseases, hemolytic disease, thrombosis, Alzheimer's disease, etc.
Several members of the murine Ly-6 supergene family are clearly involved in the progression of
certain mouse tumors, as their expression level is higher in highly malignant cells than in tumor
cells with a lower malignancy phenotype. Sorting by flow cytometry of tumor cells to
subpopulations expressing either high or low levels of Ly-6E.1 yielded cells expressing a high or a
low malignancy phenotype, respectively. Further, it was shown that LY-6 is highly expressed on
non-lymphoid tumor cells originating from a variety of tissues in mice. Upregulation or high
expression is correlated with a more malignant phenotype which results in higher efficiency of local
tumor production (Katz et al. Int J Cancer 59:684-91 (1994)).

Cells derived from angiogenic tumors express a higher tumorigenicity phenotype and a
higher capacity to produce artificial pulmonary metastases than cells from the poorly angiogenic
tumors. These cells also express significantly higher levels of the lymphocyte activation protein
Ly-6E, so the angiogenic phenotype appears to be coregulated with Ly-6 (Sagi-Assif O et al.
Immunol Lett 54(2-3):207-13 (1996)). Some LY-6 proteins also block secretion of interleukin 2
(IL-2), which is an approved anticancer agent and a key regulatory hormone in cell-mediated
immunity (Fleming T J and T R Malek J Immunol 153:1955-62 (1994)). IL-2 stimulates the
proliferation of both T and natural killer (NK) cells and activates NK cells, which can directly lyse freshly
isolated, solid tumor cells.

The high malignancy, high Ly-6E.1-expressing cells also expressed high levels of the
receptor for urokinase plasminogen activator (uPAR), whereas low malignancy, low
Ly-6E.1-expressing cells also expressed low levels of uPAR. Transfection studies have indicated that uPAR
is causally involved in conferring a high malignancy phenotype upon tumor cells expressing high
levels of Ly-6E.1. E48, a human homologue of the murine ThB Ly-6 protein, is expressed on head
and neck squamous carcinoma cells. In E48-stimulated cells, the binding of E48 to its
microenvironmental ligand appears to transduce a signal that up-regulates the expression of the FX
enzyme in these cells, leading to an increase in the levels of GDP-L-fucose (Rinat Eshel et al. J
Biol Chem, Vol. 275(17):12833-12840 (2000)). A congenital disorder of leukocyte adhesion to
vascular endothelium termed LAD II is reflected in a generalized fucose deficiency and major
defects in leukocyte trafficking and function. Ly-6 loss-variants of a murine tumor exhibit
alterations in the incorporation of fucose and mannose into cellular glycoconjugates (Witz IP J.
Cell. Biochem. Suppl. 34:61-66 (2000)).

It is believed that the protein of SEQ ID NO:266 is a novel member of the Ly-6 protein
family, and is thus a specific cell-surface glycoprotein antigen involved in signal transduction and
cell activation, proliferation and differentiation. Preferred polypeptides of the invention are
polypeptides comprising the amino acids of SEQ ID NO:266 from position 1 to position 18 and
from position 19 to position 124. Other preferred polypeptides of the invention are any fragments
of SEQ ID NO:266 having any of the biological activities described herein.

In one embodiment, this invention relates to methods and compositions using the protein of
the invention or part thereof as a marker protein to selectively identify tissues, preferably testis. For
example, the protein of the invention or part thereof may be used to synthesize specific antibodies using any
technique known to those skilled in the art. Such tissue-specific antibodies may then be used to
identify tissues of unknown origin, such as forensic samples, differentiated tumor tissue that has
metastasized to foreign bodily sites, etc., or to differentiate different tissue types in a tissue
cross-section using immunochemistry.

Another embodiment of the present invention relates to methods of using the protein of
the invention or part thereof and related compounds and derivatives to diagnose developmental and
malignant disorders in tissues including urogenital tissues and other tissues of the reproductive
system of both sexes. For example, a biological sample is obtained from a patient with cancer or at
risk of developing cancer, and the level of SEQ ID NO:25 polynucleotides or encoded polypeptides
is detected within the cells of the sample. The detection of an elevated level of the SEQ ID NO:25
polynucleotides or encoded polypeptides in the sample relative to a control level indicates the
presence of malignant cells within the patient. The expression of the protein of the invention can be
investigated using any of a number of methods, including, but not limited to, Northern blotting,
RT-PCR or immunoblotting.

Another embodiment of the invention relates to compositions and methods using the protein
of the invention or part thereof in recombinant protein form as pharmacological agents in the
treatment of developmental and malignant disorders in tissues including urogenital tissues and in
other tissues of the human reproductive system. Particularly, the protein of the invention or part thereof
can be used in the treatment of disorders which are manifested by male sterility.

In another embodiment of the invention, antibodies which bind to the protein of the
invention or part thereof are used in the treatment of tumors, e.g., human urogenital tumors,
especially to enhance the secretion of interleukin 2, which is an approved anticancer agent and key
regulatory hormone in cell-mediated immunity. Such antibodies can be used alone or bound to a
substance capable of ablating or killing cells as a therapy for urogenital disorders or cancers in
which the protein of the invention is overexpressed.

The protein of the invention or part thereof may also be used in the treatment of diseases
which can require transplantation, including various forms of cancers such as genitourinary cancers,
carcinomas, sarcomas, atherosclerosis, angiogenesis, and benign tumors. As mentioned above, the
Ly-6 family includes several proteins which are similar to the protein of the invention and which are
capable of protecting cells from complement-mediated membrane damage. Therefore, in another
embodiment of the invention, recombinant proteins encoded by SEQ ID NO:25 or fragments
thereof are administered during xenogeneic tissue transplantation to prevent complement-mediated
lysis and to block activation of endothelial cells, which normally leads to hyper-acute rejection.

In addition, prevention of complement-mediated lysis may be particularly important in
human and animal reproductive therapy, where functional survival of the germ cells during in vitro
handling is crucial. Storage of sperm is of widespread importance in commercial animal breeding
programs, human sperm donor programs, and in the treatment of certain disease states. For
example, sperm samples may be frozen for men who have been diagnosed with cancer or other
diseases that may eventually interfere with sperm production, as well as for assisted reproduction
purposes where sperm may be stored for use at other locations or times. The procedures utilized in
such cases include: washing a sperm sample to separate out the sperm-rich fraction from non-sperm
components of a sample such as seminal plasma or debris; further isolating the healthy, motile
sperm from dead sperm or from white blood cells in an ejaculate; freezing or refrigerating of sperm
for use at a later date or for shipping to females at differing locations; extending or diluting sperm
for culture in diagnostic testing or for use in therapeutic interventions such as in vitro fertilization or
intracytoplasmic sperm injection (Cohen et al. 12:994-1001 (1997)). Once sperm have been
washed or isolated, they are then extended (or diluted) in culture or holding media for a variety of
uses (sperm analysis, diagnostic tests, assisted reproduction). Each of these uses for extended or
diluted sperm requires a somewhat different formulation of basal medium (see, for review, US
Patent No. 6,140,121, Ellington et al., Oct. 2000); however, in all cases sperm survival is suboptimal
outside of the female reproductive tract. Novel additional components of a dilution or storage
medium which could improve the functional preservation of sperm would be useful. Therefore, in
another preferred embodiment of this invention, purified recombinant proteins encoded by SEQ ID
NO:25 or fragments thereof can be added as components of pharmacological media designed to
protect spermatozoa. The methods used to compose such preservation media are generally known
by those skilled in the art (e.g., Oliver S.A. et al., US Patent No. 5,897,987, Apr. 1999; Cohen J. et al.,
supra). Conversely, in yet another embodiment of this invention, ligands, inhibitors, neutralizing
antibodies or other biological agents which recognize, bind, and block the protein of the invention
can be used as components of pharmacological formulations designed for male
contraception purposes.

In still another embodiment of this invention, chimeric ligands or derivatives which
recognize the protein of the invention or part thereof and which could be internalized into cells can
be used to design a system of drug delivery finely targeted toward urogenital and other tissues
which express the protein of SEQ ID NO:266. For example, such recognizing molecules can be
incorporated into the membranes of liposomes to allow the specific delivery of the liposomes to
cells expressing the protein of SEQ ID NO:266. Methods of designing such drug delivery systems
are known by those skilled in the art (Smith H.J. Introduction to the principles of drug design and
action, 3rd ed. (1998)).

Proteins of SEQ ID NOs:417, 413, 418 (internal designations 188-45-1-0-D3-CS, 188-26-4-0-F5-CS,
and 188-5-1-0-H6-CS)

The proteins of SEQ ID NOs:417, 413, and 418, encoded by the cDNAs of SEQ ID NOs:
176, 172, and 177, are expressed in the brain and exhibit strong homology with proteins with redox
activity (see, e.g., Genbank accession numbers AK001293 and AF029689, and Geneseqp accession
number: Y59180).

The protein of SEQ ID NO:418 (320 amino acids) is a variant of AK001293 (322 amino
acids). AK001293 has six extra nucleotides, within the same ORF as SEQ ID NO:418, producing a
longer protein. SEQ ID NO:418 exhibits the Pfam zinc-binding dehydrogenase (adh_zinc)
signature from positions 16 to 313. SEQ ID NO:418 presents all the conserved residues of the
motif except for a histidine that is thought to be a zinc-ligand. This lack of zinc-ligand residues is a
feature of the quinone oxidoreductases (QOR), a subfamily of zinc-binding dehydrogenases.

SEQ ID NO:413 (191 amino acids) shares the first 172 amino acids with SEQ ID NO:418.
The deletion of one nucleotide at position 583 in the SEQ ID NO:413 cDNA sequence
(corresponding to amino acid 173), however, shifts the reading frame relative to SEQ ID
NO:418 and AK001293.

SEQ ID NO:417 is a short protein (20 amino acids) whose sequence corresponds to the
N-terminal end of the other proteins of the invention. The presence of a T (instead of a G in public
sequences and SEQ ID NOs:413 and 418) at position 128 on the cDNA creates a STOP codon,
resulting in a shorter protein.
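
The two kinds of sequence change described above, a single-nucleotide deletion that shifts the reading frame and a G-to-T substitution that creates a stop codon, can be illustrated with a short sketch. It assumes the standard genetic code and uses a hypothetical 24-nucleotide toy ORF; the positions, sequences and function names are illustrative only and do not reproduce the cDNAs of SEQ ID NOs:176, 172 or 177.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate in frame 1 until the first stop codon or the last complete codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def substitute(dna: str, position: int, base: str) -> str:
    """Replace the nucleotide at a 1-based position."""
    return dna[:position - 1] + base + dna[position:]


def delete(dna: str, position: int) -> str:
    """Remove the nucleotide at a 1-based position."""
    return dna[:position - 1] + dna[position:]


def shared_prefix(a: str, b: str) -> int:
    """Number of leading residues two proteins have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


reference = "ATGGCTGGAGAACTGTGGAAATAA"      # hypothetical toy ORF
nonsense = substitute(reference, 7, "T")    # GGA -> TGA: premature stop
frameshift = delete(reference, 10)          # one-base deletion: new reading frame

for name, seq in (("reference", reference), ("G->T", nonsense), ("deletion", frameshift)):
    print(name, translate(seq))
print("residues shared before frameshift:", shared_prefix(translate(reference), translate(frameshift)))
```

In the toy example, the substitution truncates the protein at the new stop codon, while the deletion leaves the residues upstream of the deleted base unchanged and alters every residue downstream, which is the same pattern the text describes for SEQ ID NO:417 and SEQ ID NO:413, respectively.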

SEQ ID NOs:417, 413 and 418 are similar to the QORs, a family of zinc-binding
dehydrogenases. QORs are cytoplasmic redox-regulated flavoenzymes that catalyze the one or two
electron reduction of quinones. QORs bind NADP and are inhibited by dicoumarol.

The activity of QORs protects cells against toxicity, mutagenicity, and cancer due to
exposure to environmental and synthetic quinones and their precursors. Thus, QORs play a central
role in monitoring cellular redox state and act to protect against oxidative stress induced by a
variety of metabolic situations (Raina A.K. et al. (1999) Redox Rep. 4:23-7). The oxidoreductase
activity also permits the activation of bioreductive anticancer drugs (Begleiter A. et al. (1996) Br. J.
Cancer Suppl. 27:S9-14).

The metabolism of quinones involves enzymatic reduction of the quinone by one or two
electrons. In the activation of quinone-containing antitumor agents, this reduction results in the
formation of the semiquinone or the hydroquinone of the anticancer drug. The consequence of
these enzymatic reductions is that the semiquinone yields its extra electron to oxygen with the
formation of superoxide radical anion and the original quinone. This reduction by a reductase
followed by oxidation by molecular oxygen (dioxygen) is known as redox-cycling and continues
until the system becomes anaerobic. In the case of a two-electron reduction, the hydroquinone
could become stable, and as such, be excreted by the organism in a detoxification pathway.

The cellular antioxidant response is mediated by a battery of detoxifying/defensive proteins.
The promoters of genes that encode these proteins contain a common cis-element termed the
antioxidant response element (ARE). Many transcription factors, including Nrf, Jun, Fos, Fra, Maf,
YABP, ARE-BP1, Ah (aromatic hydrocarbon) receptor, and estrogen receptor, bind to the ARE
from various genes. Among these factors, Nrf-Jun heterodimers positively regulate ARE-mediated
expression and induction of genes in response to antioxidants and xenobiotics (reviewed in
Dhakshinamoorthy S. et al. (2000) Curr. Top Cell Regul. 36:201-16). On the other hand, c-Fos
represses ARE-mediated gene expression (Venugopal, R., and Jaiswal, A.K. (1996) Proc. Natl.
Acad. Sci. USA 93, 14960-5).

Elevated levels of QOR activity have been reported in several kinds of tumors such as liver,
colon, lung and breast (Belinsky M., Jaiswal A.K. (1993) Cancer Metastasis Rev. 12:103-17).
Bioreductive antitumor agents are an important class of anticancer drugs that require activation by
reduction. For this reason, QORs are a potential target on which to base the development of new
antitumor compounds. Certain QORs have already been implicated in the metabolism, activation
and mechanism of cytotoxicity of some anticancer drugs such as mitomycin C, indoloquinone EO9
(Ross D. et al. (1994) Oncol. Res. 6:493-500), CB 1954 (Knox R.J. et al. (2000) Cancer Res.
60:4179-86) or antiestrogens in breast cancer (Montano M.M., Katzenellenbogen B.S. (1997)
PNAS 94:2581-6).

In addition, some of the proteins of the QOR family are thought to play a role in the
prevention of apoptosis following oxidative stress. The tumor suppressor gene p53 has been
directly implicated in the induction of apoptosis in dividing cells and in hippocampal pyramidal
neurons (Jordan J. et al. (1997) J. Neurosci. 17:1397-405) and a QOR gene has been described as a
p53-regulated gene (Kostic C., Shaw P.H. (2000) Oncogene 19:3978-87).

It is believed that the proteins of SEQ ID NOs:417, 413 and 418 have a redox activity, most
likely as QORs. Thus, they are expected to act as endogenous antioxidants against oxidative
stress and may be able to use NADP as a cofactor. The proteins of the invention may be used to
deactivate toxins and to activate bioreductive anticancer drugs. In addition, they may prevent
apoptosis following oxidative stress and be regulated by p53. Because the proteins of SEQ ID NOs:417
and 413 do not contain the Pfam zinc-binding dehydrogenase (adh_zinc) signature, in contrast to
SEQ ID NO:418, they may act as competitive inhibitors, i.e. dominant negative forms, of the
functional protein.

The oxidoreductase activity of the proteins of the invention may be assayed using any
technique known to those skilled in the art. For example, the measurement of the rate of oxidation
of NADPH and oxygen consumption, and the detection of the semiquinone and reactive oxygen
species, may be performed as described by Gutierrez P.L. (Gutierrez P.L. (2000) Front. Biosci.
5:D629-38), or by any other method known to those skilled in the art. The enzymatic activity of the
proteins of the invention in different affected and control tissues may be assayed by histochemical
staining. To confirm the role of the proteins of the invention in the cellular antioxidant response,
in vitro and in vivo assays may be performed. Transcription levels of the genes coding for the
proteins of the invention may be measured using standard techniques after exposure to quinones or
derived compounds such as beta-naphthoflavone (beta-NF), as described by Belinsky M. and Jaiswal
A.K. (supra), as well as in response to transcription factors such as Nrf, Jun and c-Fos, or in the
presence of p53.

In one embodiment of the present invention, the present protein can be used to detect
specific cell types in vitro or in vivo. For example, as the present proteins are overexpressed in the
brain, reagents capable of specifically recognizing the present protein can be used as markers for
brain cells. Brain-specific markers have a number of uses, including the identification of
specific tissues for histological analyses, as well as the detection of the origin of tumor cells. In
addition, as the expression of the present protein is likely induced by transcription factors such as
Nrf, Jun, Fos, Fra, Maf, YABP, ARE-BP1, Ah (aromatic hydrocarbon) receptor, and estrogen receptor,
as well as by p53, reagents specific for detecting the present protein can also be used as a marker for
the activity of any of these proteins in vitro or in vivo. In view of the association between many of
these proteins and diseases such as cancer, the ability to detect the presence or absence of the
proteins provides powerful tools for disease diagnosis and screening. For any of these applications,
the expression of the present protein can be detected using any standard method, including Northern
blots, Western blots, in situ hybridization, PCR, etc.

In another embodiment, the proteins of the invention can serve as markers for cellular
oxidative stress in vivo and in vitro. As such, the proteins of the invention or part thereof may be
useful in the diagnosis of disorders in which oxidative stress is implicated, including a large variety
of types of cancer as well as neurodegenerative disorders such as Alzheimer's disease (AD),
amyotrophic lateral sclerosis (ALS) or Parkinson's disease (PD). For diagnostic purposes, the
expression of the protein of the invention may be investigated using, e.g., Northern blotting,
RT-PCR or immunoblotting methods and compared to the expression in control individuals.
Increased levels of the proteins of the invention in patients compared with controls indicate a major
shift in redox balance and, thus, indicate the presence of the disease or of a susceptibility to the
disease.

The invention further relates to methods and compositions using the proteins of the
invention or part thereof to prevent and/or treat disorders in which oxidative stress is implicated,
including those mentioned above. For these purposes the proteins themselves, or polynucleotides
encoding the proteins, or an activator of protein expression may be administered to patients, or to
disease-free individuals in case of increased susceptibility to one of these disorders.

In another embodiment, the proteins of the invention or part thereof are used to prevent cells
from undergoing apoptosis. They may thus be useful in the diagnosis, treatment and/or prevention
of disorders and processes in which apoptosis is deleterious, including but not limited to immune
deficiency syndromes (including AIDS), type I diabetes, pathogenic infections, cardiovascular and
neurological injury, alopecia, aging, degenerative diseases including AD and PD, dystonia, Leber's
hereditary optic neuropathy and schizophrenia. For all such diagnostic purposes, the expression of
the proteins of the invention can be investigated using any of the Northern blotting, RT-PCR or
immunoblotting methods described herein and compared to the expression in control individuals.

The invention relates to methods and compositions using the proteins of the invention or
part thereof as detoxifying enzymes against quinones. There are a variety of quinones with a toxic
effect in cells (e.g. quinones derived from the oxidation of phenolic metabolites of benzene,
DA-quinones, or menadione). Thus, the proteins of the invention or part thereof may be protective
against the hematotoxic and carcinogenic effects of benzene, as well as against benzene-caused
diseases such as cancer, aplastic anemia and pancytopenia. Moreover, they may detoxify
DA-quinones in the brain, thereby providing neuroprotection in Parkinson's disease. In still another
embodiment, the proteins of the invention or part thereof may protect cells against
menadione-induced oxidative stress, with known effects on myocardial cells (Floreani M. et al. (2000)
Biochem. Pharmacol. 60:601-5). For prevention and/or treatment purposes the proteins themselves, or
polynucleotides encoding the proteins, or an activator of protein expression may be administered.

In another embodiment, the present proteins may be a target of chemotherapy specific to
different kinds of cancer, to ensure a favorable response to anticancer drugs. Specifically, proteins
of the invention or part thereof may be used as activators of cytotoxic prodrugs of the quinone family.
Accordingly, the protein of the invention or part thereof may be administered to a patient in
conjunction with a bioreductive anticancer agent in order to activate the drug. This
co-administration may be by simultaneous administration, such as a mixture of the oxidoreductase and
the drug, or by separate simultaneous or sequential administration. Cancer-specific antitumor
agents based on QOR substrates may be designed as described by Xing J. et al. (Xing J. et al. (2000)
Med. Chem. 43:457-66) and assayed as described in Li B. et al. (Li B. et al. (1999) Chem. Res.
Toxicol. 12:1042-9). Alternatively, as the present proteins may be overexpressed in tumor cells,
such methods may be performed by simply detecting the level of the present protein in tumor cells,
and administering the prodrug specifically to those patients found to have elevated levels of the
protein in their tumor cells.

Proteins of SEQ ID NOs:415, 310, 317 (internal designations 188-29-2-0-H1-CS, 188-18-4-0-A9-CS,
188-9-2-0-E1-CS)

Mammalian inositol hexakisphosphate kinase 2 (IP6K2), an enzyme of the inositol
phosphate pathway, has been cloned and described by two independent groups [Saiardi, A.;
Erdument-Bromage, H.; Snowman, A. M.; Tempst, P.; and Snyder, S. H. (1999) Current Biology 9,
1323-1326, and Katai, K.; Miyamoto, K-I.; Kishida, S.; Segawa, H.; Nii, T.; Tanaka, H.; Tani, Y.;
Arai, H.; Tatsumi, S.; Morita, K.; Taketani, Y.; and Takeda, E. (1999) Biochem. J. 343, 705-712].
Newly identified consensus sequences of inositol-polyphosphate kinases are represented by
[LV]-[LA]-[DE]-X(3-8)-P-X-[VAI]-[ML]-D-X-K-[ML]-G [Saiardi, A.; Erdument-Bromage, H.;
Snowman, A. M.; Tempst, P.; and Snyder, S. H. (1999) Current Biology 9, 1323-1326]. IP6K2
catalyzes the transfer of phosphate groups from InsP6 or Ins(1,3,4,5,6)P5 (the substrate) to another
protein or small molecule, such as a nucleoside diphosphate.
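
For illustration, a consensus of this kind can be expressed as a regular expression and used to scan a protein sequence. The sketch below assumes that the final glycine of the consensus is a separate position following [ML], and that X denotes any residue; the test sequence and the names `IPK_MOTIF` and `find_motif` are hypothetical and are not taken from SEQ ID NOs:415, 310 or 317.

```python
import re

# [LV]-[LA]-[DE]-X(3-8)-P-X-[VAI]-[ML]-D-X-K-[ML]-G as a regular expression.
# Assumption: X is any single residue and the trailing G is its own position.
IPK_MOTIF = re.compile(r"[LV][LA][DE].{3,8}P.[VAI][ML]D.K[ML]G")


def find_motif(protein: str):
    """Return (1-based start position, matched segment) for each motif hit."""
    return [(m.start() + 1, m.group()) for m in IPK_MOTIF.finditer(protein.upper())]


# Hypothetical sequences used only to exercise the pattern.
with_motif = "MSTLLENAAAAPSVMDYKMGQQ"
without_motif = "MSTLLENAAAAPSVMDYRMGQQ"  # the conserved K is replaced by R

print(find_motif(with_motif))
print(find_motif(without_motif))
```

A hit of this kind only indicates that a sequence is compatible with the consensus; kinase activity would still have to be confirmed experimentally.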

The subject invention provides the polypeptides of SEQ ID NOs:415, 310, and 317,
encoded by the cDNAs of SEQ ID NOs:174, 69, and 76, respectively. The invention also provides
biologically active fragments of SEQ ID NOs:415, 310, and 317. In one embodiment, the
polypeptides of SEQ ID NOs:415, 310, and 317 are interchanged with the corresponding
polypeptides encoded by the human cDNA of clone 188-29-2-0-H1-CS, 188-18-4-0-A9-CS, or
188-9-2-0-E1-CS. "Biologically active fragments" are defined as those peptide or polypeptide
fragments having at least one of the biological functions of the full length protein (e.g., kinase
activity). Compositions of the protein/polypeptide of SEQ ID NOs:415, 310, or 317, or biologically
active fragments thereof, are also provided by the subject invention. These compositions may be
made according to methods well known in the art.

The invention also provides variants of the protein of SEQ ID NOs:415, 310, or 317. These
variants have at least about 80%, more preferably at least about 90%, and most preferably at least
about 95% amino acid sequence identity to the amino acid sequences encoded by SEQ ID NOs:415,
310, and 317. Variants according to the subject invention also have at least one functional or
structural characteristic of the protein of SEQ ID NOs:415, 310, or 317. The invention also
provides biologically active fragments of the variant proteins. Compositions of variants, or
biologically active fragments thereof, are also provided by the subject invention. These
compositions may be made according to methods well known in the art. Unless otherwise
indicated, the methods disclosed herein can be practiced utilizing the protein encoded by SEQ ID
NO:415, 310, or 317, biologically active fragments of SEQ ID NO:415, 310, or 317, variants of
SEQ ID NO:415, 310, or 317, and biologically active fragments of the variants.
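
As a non-limiting sketch of how such identity thresholds might be checked, the following computes percent identity over two sequences that have already been aligned (gaps written as "-"). The definition used here (identical residues divided by the number of alignment columns that are not gap-only) is an assumption made for this example, since the text does not specify one; the sequences and function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct: float):
    """Return the highest of the 95/90/80 thresholds that `pct` meets, else None."""
    for threshold in (95, 90, 80):
        if pct >= threshold:
            return threshold
    return None


# Hypothetical ten-residue alignment differing at one position.
pct = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(pct, identity_tier(pct))
```

Different alignment programs and identity definitions can give somewhat different values for the same pair of sequences, so the definition in use should be stated alongside any reported percentage.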

Because of the redundancy of the genetic code, a variety of different DNA sequences can
encode the amino acid sequence of SEQ ID NO:415, 310, or 317. In a preferred embodiment, SEQ
ID NO:415, 310, or 317 is encoded by clone 188-29-2-0-H1-CS, 188-18-4-0-A9-CS, or
188-9-2-0-E1-CS, or by the cDNAs of SEQ ID NO:174, 69, or 76. It is well within the skill of a person
trained in the art to create these alternative DNA sequences which encode proteins having the same,
or essentially the same, amino acid sequence. These variant DNA sequences are, thus, within the
scope of the subject invention. As used herein, reference to "essentially the same" sequence refers
to sequences that have amino acid substitutions, deletions, additions, or insertions that do not
materially affect biological activity. Fragments retaining one or more characteristic biological
activity of the protein encoded by SEQ ID NO:415, 310, or 317 are also included in this definition.
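
The extent of this redundancy can be illustrated by counting how many distinct coding sequences specify a given peptide under the standard genetic code. The sketch below is illustrative only; the peptide used is hypothetical, and the name `count_encodings` is introduced here for the example.

```python
from math import prod

# Number of codons specifying each amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Count the DNA sequences (stop codon excluded) that encode `peptide`."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())


# Hypothetical seven-residue peptide.
print(count_encodings("MAEGLPK"))
```

Because the count is a product over residues, it grows exponentially with peptide length, which is why a full-length protein can be encoded by a very large number of alternative DNA sequences.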

In one aspect of the subject invention, SEQ ID NO:415, 310, or 317, and variants thereof,
can be used to generate polyclonal or monoclonal antibodies. Both biologically active and
immunogenic fragments of SEQ ID NO:415, 310, or 317, or variant proteins, can be used to
produce antibodies. Polyclonal and/or monoclonal antibodies can be made according to methods
well known to the skilled artisan. Antibodies produced in accordance with the subject invention can
be used in a variety of detection assays known to those skilled in the art. The antibodies may be
used to agonize or antagonize the biological activity of the protein of SEQ ID NO:415, 310, or 317.

The protein of SEQ ID NO:415, 310, or 317 can be used for the synthesis of nucleoside
triphosphate (NTP) compounds. In one embodiment, the NTP compound produced is ATP, GTP,
CTP, or TTP. In this aspect of the subject invention, SEQ ID NO:415, 310, or 317 removes a
phosphate from InsP6 or Ins(1,3,4,5,6)P5 and transfers it to a nucleoside diphosphate (e.g., ADP,
CDP, GDP, or TDP) to create an NTP. The conditions and methods for the synthesis of NTP
compounds, such as ATP, are well known to the skilled artisan. Thus, the protein of SEQ ID
NO:415, 310, or 317 has an industrially useful function for the synthesis of commercially valuable
products.

The subject invention also provides methods of determining the relative amounts of InsP6
or Ins(1,3,4,5,6)P5 in the cell by a kinase assay. In this aspect of the invention, SEQ ID NO:415,
310, or 317 can be used to transfer phosphate groups from InsP6 or Ins(1,3,4,5,6)P5 to acceptor
substrates according to well-known kinase activity assays.

Protein of SEQ ID NO:294 (internal designation 181-16-2-0-A7-CS)

The protein of SEQ ID NO:294 is encoded by the cDNA of SEQ ID NO:53. It will be
appreciated that all characteristics and uses of the polypeptide of SEQ ID NO:294 described
throughout the present application also pertain to the polypeptide encoded by the human cDNA of
clone 181-16-2-0-A7-CS. In addition, it will be appreciated that all characteristics and uses of the
nucleic acid of SEQ ID NO:53 described throughout the present application also pertain to the
human cDNA of clone 181-16-2-0-A7-CS.

This gene was isolated from fetal liver and expression has also been detected in fetal
kidney, placenta, liver, brain, hypertrophic prostate, salivary gland and testis. Data from PCT
application WO 98/23435 indicate expression is primarily in bone marrow cell lines, and to a lesser
extent, in human endometrial stromal cells, human adult small intestine and human pancreas tumor.
PCT application WO 99/14484 reports the fraction of expression in the gastrointestinal system
(0.227), reproductive system (0.193), and hematopoietic/immune system (0.168). Finally, this
protein is 55% identical and 76% similar to CGI-128 protein, which was isolated from CD34+ cells
and is also found in cell lines from the hematopoietic lineage including HL60 (granulocytic), Jurkat
(T-lymphocytic), K562 (erythro-megakaryocytic), and U937 (monocytic).

Supernatant harvested from cells expressing the product of this gene has been shown to
increase the permeability of the plasma membrane of renal mesangial cells to calcium. Thus, it is
believed that the product of this gene is involved in activating a signal transduction pathway when it
binds a receptor on the surface of the plasma membrane of both mesangial cells and other cell types,
in addition to other cell lines or tissue cell types. Thus, the polynucleotides and polypeptides have
uses, which include, but are not limited to, activating mesangial cells by contacting said cells with a
full length polypeptide or a polypeptide fragment which demonstrates this biological activity.
Further, the polynucleotides and polypeptides can be used in the methods described in WO 99/15652,
incorporated in its entirety. Binding of a ligand to a receptor is known to alter intracellular levels of
small molecules, such as calcium, potassium and sodium, as well as alter pH and membrane
potential. Alterations in small molecule concentration can be measured to identify supernatants
that bind to receptors of a particular cell. In addition, when tested against fibroblast cell lines,
supernatants removed from cells containing this gene activated the EGR1 (early growth response
gene 1) promoter element. Thus, it is likely that this gene activates fibroblast cells through the
EGR1 signal transduction pathway. EGR1 is a separate signal transduction pathway from
Jak-STAT; genes containing the EGR1 promoter are induced in various tissues and cell types upon
activation, leading the cells to undergo differentiation and proliferation (PCT application WO
98/23435).

Polynucleotides comprising sequences encoding the signal peptide of the protein, e.g.
VLWLSGLSEPGAA/RQ, can be used in the construction of secretion vectors. These vectors would
then facilitate the secretion of fusion proteins into the media of cells that have been transfected with
the construct of interest. Antibodies which specifically bind the signal peptide could be used to
purify the fusion protein from the media if desired.

Polynucleotides and polypeptides of the invention are useful as reagents for differential
identification of the tissue(s) or cell type(s) present in a biological sample and for diagnosis of
diseases and conditions which include, but are not limited to, haemopoietic and gastrointestinal tract
disorders and stromatosis, in addition to endothelial, mucosal, or epithelial cell disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of disorders of the
above tissues or cells, particularly of the immune and digestive systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues and cell types (e.g.
hematopoietic, immune, reproductive, gastrointestinal, endocrine, and cancerous and wounded
tissues) or bodily fluids (e.g. lymph, serum, plasma, urine, synovial fluid and spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution in bone marrow cells, fetal liver and fetal kidney, combined with the
detected calcium flux and EGR1 biological activity, indicates that polynucleotides and polypeptides
corresponding to this gene are useful for immune and gastrointestinal tract disorders, and
stromatosis, particularly tumors and proliferative disorders. More specifically, polynucleotides and
polypeptides corresponding to this gene are useful for the treatment and diagnosis of hematopoietic
related disorders such as anemia, pancytopenia, leukopenia, thrombocytopenia or leukemia since
stromal cells are important in the production of cells of hematopoietic lineages. The polypeptides
and polynucleotides of the invention can be used to enhance hematopoiesis as described in
WO 98/31385, incorporated in its entirety. The uses include bone marrow cell ex vivo culture, bone
marrow transplantation, bone marrow reconstitution, radiotherapy or chemotherapy of neoplasia.
The gene product may also be involved in lymphopoiesis; therefore, it can be used in immune
disorders such as infection, inflammation, allergy, immunodeficiency, etc. In addition, this gene
product may have commercial utility in the expansion of stem cells and committed progenitors of
various blood lineages, and in the differentiation and/or proliferation of various cell types. The
protein, as well as antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy target for the above listed tissues.

Additionally, since the gene product of 181-16-2-0-A7-CS has been shown to activate the
EGR1 promoter element, it likely activates EGR1 signaling activity in fibroblasts. Recent data
show that activation of EGR1 plays a role in wound repair. The cellular transcription factor early
growth response factor 1 (Egr-1) is expressed minutes after acute injury and serves to stimulate the
production of a class of growth factors whose role is to promote tissue repair. Egr-1 expression at
the site of dermal wounding in rodents promotes angiogenesis in vitro and in vivo, increases
collagen production, and accelerates wound closure. These results show that Egr-1 gene therapy
accelerates the normal healing process (Human Gene Ther 2000, vol 11(15):2143-58). Thus, an
activator of EGR1 signaling, specifically the gene products of 181-16-2-0-A7-CS (polypeptides and
polynucleotides), would be useful in the wound healing process using the methods described in
WO 99/41282 and WO 99/32135, incorporated by reference in their entireties.

Protein of SEQ ID NO:305 (internal designation 187-37-0-0-c10-CS)



The protein of SEQ ID NO:305, encoded by the cDNA of SEQ ID NO:64, is highly
expressed in the prostate and brain. The protein of the invention is strongly homologous to the D9
protein, found in both humans (GNP accession number: U95006 and U95007) and in mice (GNP
accession number: U95003, U95004, and U95005). D9 is a myeloid precursor protein transcript
regulated by the retinoic acid receptor α, hereafter referred to as RAR-α (Scott et al. Blood 1996;
88:2517-30).

Retinoic acid is the active metabolite of vitamin A, which contributes to a wide range of
biological processes such as cellular differentiation, embryogenesis, and tumor suppression. More
specifically, retinoic acid stimulates myeloid precursor differentiation into mature granulocytes.
For instance, in vitro treatment of acute promyelocytic leukemia blast cells with retinoic acid
induces their differentiation (Miyauchi et al. Leuk Lymphoma 1999;33:267-80). In addition,
treatment with retinoic acid can induce disease remission in patients affected with promyelocytic
leukemia by causing granulocyte precursor differentiation (Slack et al. Ann Hematol
2000;79:227-38).

The diverse range of responses to retinoic acid is mediated by three receptor subtypes:
RAR-α, RAR-β, and RAR-γ. RAR-α has been identified as being important for bone marrow
maturation of granulocytes (Tsai et al. Genes Dev 1992;6:2258-69). In addition, RAR-α is almost
invariably involved in acute promyelocytic leukemia cells by a reciprocal translocation between the
long arms of chromosomes 15 and 17 (Alcalay et al., Proc Natl Acad Sci USA 1991;88:1977-81).
This type of leukemia is mainly characterized by a predominance of malignant promyelocytes, and
severe hemorrhagic manifestations resulting from activation of the coagulation cascade and the
fibrinolytic system (Tallman et al. Semin Thromb Hemost 1999;25:209-15). Reciprocal
chromosomal translocation leads to the production of a fusion protein that inhibits differentiation
and promotes survival of myeloid precursor cells (Grignani et al. Cell 1993;74:423-431). Transient
transfection of a vector containing RAR-α in a promyelocyte cell line causes early upregulation
of several genes, including D9, which is strongly related to the protein of SEQ ID NO:305
(Scott et al. Blood 1996; 88:2517-30). Thus, it is believed that the protein of SEQ ID NO:305 is a
myeloid-related protein whose expression is induced by the activation of retinoic acid receptors,
including RAR-α.

In a preferred embodiment, the protein of the invention or part thereof may be used to assay
the activity of RAR-α protein or retinoic acid in a biological sample. Specifically, as the expression
of the protein is believed to be under the direct control of retinoic acid receptors, the level of the
protein of the invention, or of the mRNA encoding the protein, can serve as a sensitive and
immediate marker for the effects of retinoic acid upon a cell. An ability to detect retinoic acid
receptor activation in cells using the present protein has numerous uses. For instance, the protein of
the invention or part thereof can be used to monitor the effects of retinoic acid on cells of a patient
undergoing retinoic acid treatment for promyelocytic leukemia (Slack et al. Ann Hematol
2000;79:227-38). As retinoic acid treatment is associated with frequent delayed dose-dependent
side effects, it is believed that an assay based on the protein of SEQ ID NO:305 could be used to adjust
the dose of retinoic acid administered to patients affected with promyelocytic leukemia, in order to
predict and avoid such adverse side effects (Slack et al. Ann Hematol 2000;79:227-38).

In another embodiment, the present polypeptides and polynucleotides can be used to
identify myeloid precursors, as well as brain and prostate tissues. The ability to specifically
visualize myeloid precursor cells, as well as brain and prostate tissues (and cells derived from the
tissues), is useful for any of a number of applications, including to determine the origin or identity
of, e.g., cancerous cells, as well as to facilitate the identification of particular cells and tissues for,
e.g., the evaluation of histological slides. In addition, such assays can be used to examine the extent
of differentiation in myeloid precursor cells.

The present invention further relates to in vitro assays and diagnostic kits based on the
protein of the present invention or part thereof. Such assays may be used for diagnosis of disorders
where the protein activity is abnormally downregulated, such as cancer, and hematological
disorders including leukemia. As the protein of SEQ ID NO:305, RAR-α, and acute promyelocytic
leukemia are all related, variation in the measured level of the protein of the invention or
part thereof can be used as a diagnostic or screening test for acute promyelocytic leukemia, e.g.
using a biological sample such as serum or plasma. Further, an assay that can detect an abnormal
level of the protein of the invention or part thereof can be used to detect residual disease in acute
promyelocytic leukemia. Such an assay may be used to aid therapeutic decisions in this disorder,
e.g. more or less aggressive treatments, the duration of treatment, etc.

In another embodiment, various methods can be used to modulate activity and/or expression
of the protein of SEQ ID NO:305, e.g. for the treatment, attenuation and/or prevention of various
disorders. In one embodiment, any of a number of reagents, e.g. polynucleotides encoding the
protein of SEQ ID NO:305 or a fragment thereof, the protein of SEQ ID NO:305 itself, or a
compound that increases the expression or activity of the protein of SEQ ID NO:305, can be
administered to a patient suffering from, or at risk of developing, various disorders including
cancer, and hematological diseases such as leukemia and neutropenia. For instance, and without
limitation, proteins or other compounds capable of enhancing the expression or activity of the
protein of SEQ ID NO:305 can be administered to treat patients affected with acute promyelocytic
leukemia, in order to induce differentiation of the affected cells into mature granulocytes (Slack et
al. Ann Hematol 2000;79:227-38). In still another preferred embodiment, proteins or other compounds
capable of increasing the expression or activity of the protein of the invention can be used to treat,
prevent and/or attenuate neutropenia or agranulocytosis in patients, in order to induce in vivo
differentiation of myeloid precursors into mature granulocytes. In still another preferred embodiment,
proteins or other compounds capable of increasing the expression or activity of the protein of SEQ ID NO:305
can be used to treat coagulopathic diseases, such as thrombosis or hemorrhagic manifestations. For
instance, they can be used to treat disseminated intravascular coagulation, a severe hemorrhagic
syndrome. This embodiment is supported by the fact that acute promyelocytic leukemia is
frequently associated with disseminated intravascular coagulation (Tallman et al. Semin Thromb
Hemost 1999;25:209-15), and disseminated intravascular coagulation is efficiently corrected with
retinoic acid (Dombret et al. Leukemia 1993;7:2-9).

In addition, modulation of the expression or activity of the protein of the invention can be
used to modulate differentiation of cells, e.g. promyelocytic leukemia cells. In one such embodiment,
the protein of the invention is inhibited, e.g. using antisense molecules, antibodies, or small molecule
inhibitors of the expression or activity of the protein, in order to maintain the undifferentiated state
of cells grown in vitro. Alternatively, agents that increase the expression or activity of the protein
in cells can be used to induce cellular differentiation, e.g. in the preparation of specific cell types in
vitro for particular therapeutic applications.

Proteins of SEQ ID NO:248 (internal designation 105-035-2-0-C6-CS) and SEQ ID NO:313
(internal designation 188-28-4-0-D4-CS)

The proteins of SEQ ID NO:248, encoded by the cDNA of SEQ ID NO:7, and SEQ ID NO:313,
encoded by the cDNA of SEQ ID NO:72, are highly expressed in brain, liver, pancreas, and
testis. The proteins of the invention are nuclear proteins (Miller et al. J Biol Chem
2000;275:32052-6) that display a membrane-spanning segment from amino acids 58 to 78. These
proteins are homologous to the human RNA polymerase II elongation factor ELL3 (EMBL
accession number AF276512; Miller et al. J Biol Chem 2000;275:32052-6). In addition, the
proteins of SEQ ID NO:248 and SEQ ID NO:313 share sequence homology with two other
members of the polymerase II elongation factor family: ELL and ELL2. The protein of SEQ ID
NO:313 is similar to the N-terminal sequence of the protein of SEQ ID NO:248, but differs after
residue 240 because of a frameshift that produces a premature stop in the sequence of SEQ ID NO:72
(Miller et al. J Biol Chem 2000;275:32052-6). Additionally, the alignment of the protein of SEQ
ID NO:248 with occludin, an integral membrane protein found at tight junctions (Furuse et al. J Cell
Biol 1994;127:1617-26), reveals that both proteins display a C-terminal ZO-1 binding domain, with
26% homology over a 108 amino acid segment. The protein of SEQ ID NO:313 lacks this domain, as its
C-terminal region is truncated as compared to the protein of SEQ ID NO:248. ZO-1 is part of the
family of membrane-associated guanylate kinase homologs (MAGUKs) believed to be important in
signal transduction originating from sites of cell-cell contact (Willott et al. Proc Natl Acad Sci USA
1993;90:7834-8).

The proteins of SEQ ID NOs:248 and 313 are RNA polymerase II elongation factors that
increase the catalytic rate of transcription elongation, a phase during which RNA polymerase II
moves along the DNA and extends the growing RNA chain (Miller et al. J Biol Chem
2000;275:32052-6). Specifically, the proteins of SEQ ID NOs:248 and 313 suppress transient pausing at
multiple sites along the DNA, thereby altering the Km and/or the Vmax of the polymerase (Miller et
al. J Biol Chem 2000;275:32052-6). The present proteins belong to a family that is known to
include one virally encoded protein (Tat) and six cellular proteins (SII, P-TEFb, TFIIF, Elongin
(SIII), ELL and ELL2).

A growing body of evidence suggests that mis-regulation of elongation may be a key
element in a variety of human diseases (see Aso et al. J Clin Invest 1996;97:1561-9). For instance,
two RNA polymerase II elongation proteins have been implicated in oncogenesis: ELL, which is a
frequent target for translocation in acute myeloid leukemia (Thirman et al. Proc Natl Acad Sci USA
1994;91:12110-4; Mitani et al. Blood 1995;85:2017-24), and elongin, which is a transcription
factor regulated by the product of the von Hippel-Lindau tumor suppressor gene, which is itself
mutated in the majority of clear-cell renal carcinomas and in families with von Hippel-Lindau
disease (Duan et al. Science 1995;269:1402-6; Kibel et al. Science 1995;269:1444-6). In addition,
overexpression of ELL leads to the transformation of fibroblasts (Kanda et al. J Biol Chem
1998;273:5248-52). Thus, the proteins of SEQ ID NOs:248 and 313 may be important for
oncogenesis of multiple types of neoplastic diseases, especially hematological malignancies.

In one embodiment, the present proteins are used to increase the rate of transcription in 
vitro. Such an increase can be used for any of the large number of in vitro transcription reactions 
which are routinely used for molecular biological applications, e.g. for the preparation of RNA, for
protein production, for the characterization of promoters and transcription factors, etc.

In another embodiment, the present invention provides diagnostic tools for the detection of
mutations in the genes encoding SEQ ID NOs:248 or 313. Such mutations may be detected by a
variety of techniques, including RNase and S1 protection assays; alterations in electrophoretic
mobility of DNA fragments in gels with or without denaturing agents (e.g. SSCP or DGGE);
dHPLC; and direct DNA sequencing. The detection of mutations in the genes encoding SEQ ID
NOs:248 or 313 is useful for the detection of a number of diseases and conditions, such as cancers
and hematological malignancies including leukemia. For example, the RNA polymerase II
elongation factor ELL gene undergoes frequent translocations in acute myeloid leukemia (Thirman
et al. Proc Natl Acad Sci USA 1994;91:12110-4; Mitani et al. Blood 1995;85:2017-24), and it is
likely that other elongation factors are involved in additional such diseases.

Another embodiment of the present invention relates to compositions and methods for
using the proteins or parts thereof to specifically visualize myeloid precursor cells, as well as
pancreas, liver and testis tissues (and cells derived from the tissues). The ability to detect such cell
types is useful for any of a number of applications, including to determine the origin or identity of,
e.g. cancerous cells, as well as to facilitate the identification of particular cells and tissues for, e.g.
the evaluation of histological slides. In addition, such methods can be used to examine the extent of
differentiation in myeloid or myeloid-progenitor cells for staging of leukemia or any other
neoplastic disorder. Any method for detecting the presence of the proteins of the invention, or
nucleic acids encoding the proteins, can be used, including methods involving the use of antibodies
immunospecific for the proteins of the invention. Such antibodies can be used in various methods
including radioimmunoassays, competitive binding assays, Western blot analysis and enzyme-linked
immunosorbent assays (ELISA), or any other technique known to those skilled in the
art. In another embodiment, the present proteins or parts thereof can be used for the treatment,
attenuation and/or prevention of conditions associated with unbalanced amounts and/or activity of
the protein of SEQ ID NO:248 or 313. Other modulatory substances can also be used in such
embodiments, including chemical compounds such as agonists and antagonists, nucleic acids
including antisense and ribozyme sequences, and antibodies. In a preferred embodiment, such
substances are employed for the treatment or prevention of certain types of neoplastic disorders
such as cancer or hematological malignancies such as leukemia. In such embodiments,
where an increased level of expression or activity of the present proteins is correlated with the
presence of a disease such as cancer, the disease can be treated or prevented using any agent that
can provoke a decrease in the level of activity or expression of the protein, such as antibodies,
antisense molecules, ribozymes, dominant negative forms of the protein, compounds that inhibit the
expression or activity of the proteins, and others. Alternatively, in cases where a decreased level of
expression or activity of the proteins is correlated with the presence of a disease such as cancer, the
disease can be treated using any agent that can cause an increase in the expression or activity of the
protein, such as polynucleotides encoding the proteins, purified forms of the proteins, or any
compound that causes an increase in the expression or activity of the proteins. Further, any
detection of a correlation between the level of expression or activity of the protein and the presence
or absence of a disease can be used to develop diagnostic or screening tools for the detection of the
disease itself, or of a predisposition for the disease.

Uses of antibodies

Antibodies of the present invention have uses that include, but are not limited to, methods
known in the art to purify, detect, and target the polypeptides of the present invention, including
both in vitro and in vivo diagnostic and therapeutic methods. An example of such use, employing
immunoaffinity chromatography, is given below. The antibodies of the present invention may be
used either alone or in combination with other compositions. For example, the antibodies have use
in immunoassays for qualitatively and quantitatively measuring levels of antigen-bearing substances,
including the polypeptides of the present invention, in biological samples (see, e.g., Harlow et al.,
1988, incorporated by reference in its entirety). The antibodies may also be used in therapeutic
compositions for killing cells expressing the protein or reducing the levels of the protein in the body.

The invention further relates to antibodies that act as agonists or antagonists of the
polypeptides of the present invention. For example, the present invention includes antibodies that
disrupt the receptor/ligand interactions with the polypeptides of the invention either partially or
fully. Included are both receptor-specific antibodies and ligand-specific antibodies. Included are
receptor-specific antibodies which do not prevent ligand binding but prevent receptor activation.
Receptor activation (i.e., signaling) may be determined by techniques described herein or otherwise
known in the art. Also included are receptor-specific antibodies which prevent both ligand binding
and receptor activation. Likewise, included are neutralizing antibodies that bind the ligand and
prevent binding of the ligand to the receptor, as well as antibodies that bind the ligand, thereby
preventing receptor activation, but do not prevent the ligand from binding the receptor. Further
included are antibodies that activate the receptor. These antibodies may act as agonists for either all
or less than all of the biological activities affected by ligand-mediated receptor activation. The
antibodies may be specified as agonists or antagonists for biological activities comprising specific
activities disclosed herein. The above antibody agonists can be made using methods known in the
art. See e.g., WO 96/40281; US Patent 5,811,097; Deng et al. (1998); Chen et al. (1998); Harrop et
al. (1998); Zhu et al. (1998); Yoon et al. (1998); Prat et al. (1998); Pitard et al. (1997); Liautard et
al. (1997); Carlson et al. (1997); Taryman et al. (1995); Muller et al. (1998); Bartunek et al. (1996)
(said references incorporated by reference in their entireties).

As discussed above, antibodies to the polypeptides of the invention can, in turn, be utilized
to generate anti-idiotypic antibodies that "mimic" polypeptides of the invention using techniques
well known to those skilled in the art (see, e.g., Greenspan and Bona (1989) and Nissinoff (1991),
which disclosures are hereby incorporated by reference in their entireties). For example, antibodies
which bind to and competitively inhibit polypeptide multimerization or binding of a polypeptide of
the invention to a ligand can be used to generate anti-idiotypes that "mimic" the polypeptide
multimerization or binding domain and, as a consequence, bind to and neutralize the polypeptide or
its ligand. Such neutralizing anti-idiotypic antibodies can be used to bind a polypeptide of the
invention or to bind its ligands/receptors, and thereby block its biological activity.

Immunoaffinity Chromatography

Antibodies prepared as described herein are coupled to a support. Preferably, the antibodies
are monoclonal antibodies, but polyclonal antibodies may also be used. The support may be any of
those typically employed in immunoaffinity chromatography, including Sepharose CL-4B
(Pharmacia, Piscataway, NJ), Sepharose CL-2B (Pharmacia, Piscataway, NJ), Affi-gel 10 (Biorad,
Richmond, CA), or glass beads.

The antibodies may be coupled to the support using any of the coupling reagents typically
used in immunoaffinity chromatography, including cyanogen bromide. After coupling the antibody
to the support, the support is contacted with a sample which contains a target polypeptide whose
isolation, purification or enrichment is desired. The target polypeptide may be a polypeptide
selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides
included in SEQ ID Nos: 242-272 and 274-384 as well as full-length and mature polypeptides
encoded by the clone inserts of the deposited clone pool, variants and fragments thereof, or a fusion
protein comprising said selected polypeptide or a fragment thereof.

Preferably, the sample is placed in contact with the support for a sufficient amount of time
and under appropriate conditions to allow at least 50% of the target polypeptide to specifically bind
to the antibody coupled to the support.

Thereafter, the support is washed with an appropriate wash solution to remove polypeptides
which have non-specifically adhered to the support. The wash solution may be any of those
typically employed in immunoaffinity chromatography, including PBS, Tris-lithium chloride buffer
(0.1M lysine base and 0.5M lithium chloride, pH 8.0), Tris-hydrochloride buffer (0.05M
Tris-hydrochloride, pH 8.0), or Tris/Triton/NaCl buffer (50mM Tris-Cl, pH 8.0 or 9.0, 0.1% Triton
X-100, and 0.5M NaCl).

After washing, the specifically bound target polypeptide is eluted from the support using the
high pH or low pH elution solutions typically employed in immunoaffinity chromatography. In
particular, the elution solutions may contain an eluant such as triethanolamine, diethylamine,
calcium chloride, sodium thiocyanate, potassium bromide, acetic acid, or glycine. In some
embodiments, the elution solution may also contain a detergent such as Triton X-100 or
octyl-beta-D-glucoside.

Import vectors 

The GENSET polypeptides of the invention may also be used as a carrier to import a
protein or peptide of interest, the so-called cargo, into tissue-culture cells or into host organisms. A
hydrophobic region of a GENSET polypeptide or a fragment thereof, preferably the signal peptide
of a sequence selected from the group consisting of SEQ ID Nos: 1-31 and 33-143 and clone
inserts of the deposited clone pool, more preferably the short core hydrophobic region (h) of signal
peptides, may be used as a carrier.

When cell permeable peptides of limited size (approximately up to 25 amino acids) are to
be translocated across the cell membrane, chemical synthesis may be used in order to add the h region
to either the C-terminus or the N-terminus of the cargo peptide of interest. Alternatively, when
longer peptides or proteins are to be imported into cells, nucleic acids can be genetically engineered,
using techniques familiar to those skilled in the art, in order to link the cDNA sequence or fragment
thereof encoding the hydrophobic region to the 5' or the 3' end of a DNA sequence coding for a
cargo polypeptide. Such genetically engineered nucleic acids are then translated either in vitro or in
vivo after transfection into appropriate cells, using conventional techniques to produce the resulting
cell permeable polypeptide. Suitable host cells are then simply incubated with the cell permeable
polypeptide, which is then translocated across the membrane.

This method may be applied to study diverse intracellular functions and cellular processes.
For instance, it has been used to probe functionally relevant domains of intracellular proteins and to
examine protein-protein interactions involved in signal transduction pathways (Lin et al., J. Biol.
Chem., 270: 14225-14258 (1995); Lin et al., J. Biol. Chem., 271: 5305-5308 (1996); Rojas et al., J.
Biol. Chem., 271: 27456-27461 (1996); Rojas et al., Nature Biotech., 16: 370-375 (1998); Liu et al.,
Proc. Natl. Acad. Sci. USA, 93: 11819-11824 (1996); Rojas et al., Bioch. Biophys. Res. Commun.,
234: 675-680 (1997); Du et al., J. Peptide Res., 51: 235-243 (1998)).

Such techniques may be used in cellular therapy to import proteins producing therapeutic
effects. For instance, cells isolated from a patient may be treated with imported therapeutic proteins
and then re-introduced into the host organism.

Alternatively, the hydrophobic region of signal peptides of the present invention could be
used in combination with a nuclear localization signal to deliver nucleic acids into the cell nucleus.
Such nucleic acids may be antisense oligonucleotides or oligonucleotides designed to form triple
helixes, as described herein, in order to respectively inhibit processing or maturation of a target
cellular RNA.

Expression of GENSET products 

Spatial expression of the GENSET genes of the invention 

Tissue expression of the cDNAs of the present invention was examined. Table IX lists the
Genset libraries of tissues and cell types examined that express the polynucleotides of the present
invention. The tissues and cell types examined for polynucleotide expression were: adrenal gland
(AG), bone marrow (BM), brain (Br), cancerous prostate (CP), cerebellum (Ce), colon (Co),
dystrophic muscle (DM), fetal brain (FB), fetal kidney (FK), fetal liver (FL), heart (He),
hypertrophic prostate (HP), kidney (Ki), liver (Li), lung (Lu), lung cells (LC), lymph ganglia (LG),
lymphocytes (Ly), muscle (Mu), ovary (Ov), pancreas (Pa), pituitary gland (PG), placenta (Pl),
prostate (Pr), salivary gland (SG), spinal cord (SC), spleen (Sp), stomach/intestine (SI), substantia
nigra (SN), testis (Te), thyroid (Ty), umbilical cord (UC) and uterus (Ut).

For each cDNA referred to by its sequence identification number (first column), the number
of proprietary 5'ESTs (i.e. cDNA fragments) expressed in a particular tissue referred to by its name
is indicated after a semicolon (second column). In addition, the bias in the spatial distribution of
the polynucleotide sequences of the present invention was examined by comparing the relative
proportions of the biological polynucleotides of a given tissue using the following statistical
analysis. The under- or over-representation of a polynucleotide of a given cluster in a given tissue
was assessed using the normal approximation of the binomial distribution. When the observed
proportion of a polynucleotide of a given tissue in a given consensus had less than a 1% chance of
occurring randomly according to the chi-square test, the frequency bias was reported as "preferred".
The results are given in Table X as follows. For each polynucleotide showing a bias in tissue distribution
as referred to by its sequence identification number in the first column, the list of tissues where the 
polynucleotides are under-represented is given in the second column entitled "low frequency 
expression" and the list of tissues where the polynucleotides are over-represented is given in the 
third column entitled "high frequency expression". 
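
As an illustration only, the frequency-bias test described above can be sketched as follows. The sketch assumes that, for a given cluster, one knows the number of its 5'ESTs found in the tissue of interest (k), the total number of its 5'ESTs (n), and the fraction of all ESTs in the libraries that originate from that tissue (p_tissue). These names, the two-sided form of the test and the example counts are assumptions made for the illustration and are not values taken from Table X. Under the normal approximation of the binomial distribution the z-score squared equals the one-degree-of-freedom chi-square statistic, so both lead to the same 1% threshold.

```python
import math


def tissue_bias(k, n, p_tissue, alpha=0.01):
    """Classify the representation of a cluster in one tissue.

    k        -- ESTs of the cluster observed in the tissue (hypothetical input)
    n        -- total ESTs of the cluster across all tissues
    p_tissue -- fraction of all ESTs that come from this tissue
    alpha    -- significance threshold (1% as stated in the text)

    Returns (label, z, p_value), where label is "high", "low" or "no bias".
    """
    if n <= 0 or not 0.0 < p_tissue < 1.0:
        raise ValueError("n must be positive and 0 < p_tissue < 1")
    expected = n * p_tissue
    std_dev = math.sqrt(n * p_tissue * (1.0 - p_tissue))
    z = (k - expected) / std_dev
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    if p_value >= alpha:
        return "no bias", z, p_value
    return ("high" if z > 0 else "low"), z, p_value


if __name__ == "__main__":
    # Hypothetical cluster: 30 of its 100 ESTs come from a tissue
    # that contributes 10% of all ESTs.
    print(tissue_bias(30, 100, 0.10))
```

The normal approximation is generally considered reliable only when the expected counts are not too small; for sparsely sampled tissues an exact binomial test may be the safer choice.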
The cellular localization of some polypeptides of the invention was also determined using
the PSORT software (Nakai and Horton (1999); Nakai and Kanehisa (1992), which disclosures are
hereby incorporated by reference in their entireties). For each polypeptide identified by its
sequence identification number in the first column, the second column of Table XI lists the predicted
subcellular localization.

Evaluation of Expression Levels and Patterns of GENSET mRNAs

The spatial and temporal expression patterns of GENSET mRNAs, as well as their 
expression levels, may also be further determined as follows. 

Expression levels and patterns of GENSET mRNAs may be analyzed by solution
hybridization with long probes as described in International Patent Application No. WO 97/05277,
the entire contents of which are hereby incorporated by reference. Briefly, a GENSET
polynucleotide, or fragment thereof, corresponding to the gene encoding the mRNA to be
characterized is inserted at a cloning site immediately downstream of a bacteriophage (T3, T7 or
SP6) RNA polymerase promoter to produce antisense RNA. Preferably, the GENSET
polynucleotide is at least 100 nucleotides in length. The plasmid is linearized and transcribed in
the presence of ribonucleotides comprising modified ribonucleotides (i.e. biotin-UTP and
DIG-UTP). An excess of this doubly labeled RNA is hybridized in solution with mRNA isolated from
cells or tissues of interest. The hybridizations are performed under standard stringent conditions
(40-50°C for 16 hours in an 80% formamide, 0.4 M NaCl buffer, pH 7-8). The unhybridized probe
is removed by digestion with ribonucleases specific for single-stranded RNA (i.e. RNases CL3, T1,
Phy M, U2 or A). The presence of the biotin-UTP modification enables capture of the hybrid on a
microtitration plate coated with streptavidin. The presence of the DIG modification enables the
hybrid to be detected and quantified by ELISA using an anti-DIG antibody coupled to alkaline
phosphatase.

The GENSET cDNAs, or fragments thereof, may also be tagged with nucleotide sequences
for the serial analysis of gene expression (SAGE) as disclosed in UK Patent Application No. 2 305
241 A, the entire contents of which are incorporated by reference. In this method, cDNAs are
prepared from a cell, tissue, organism or other source of nucleic acid for which it is desired to
determine gene expression patterns. The resulting cDNAs are separated into two pools. The
cDNAs in each pool are cleaved with a first restriction endonuclease, called an "anchoring enzyme,"
having a recognition site which is likely to be present at least once in most cDNAs. The fragments
which contain the 5' or 3' most region of the cleaved cDNA are isolated by binding to a capture
medium such as streptavidin coated beads. A first oligonucleotide linker having a first sequence for
hybridization of an amplification primer and an internal restriction site for a "tagging endonuclease"
is ligated to the digested cDNAs in the first pool. Digestion with the second endonuclease produces
short "tag" fragments from the cDNAs. A second oligonucleotide having a second sequence for
hybridization of an amplification primer and an internal restriction site is ligated to the digested
cDNAs in the second pool. The cDNA fragments in the second pool are also digested with the
"tagging endonuclease" to generate short "tag" fragments derived from the cDNAs in the second
pool. The "tags" resulting from digestion of the first and second pools with the anchoring enzyme
and the tagging endonuclease are ligated to one another to produce "ditags." In some embodiments,
the ditags are concatamerized to produce ligation products containing from 2 to 200 ditags. The tag
sequences are then determined and compared to the sequences of the GENSET cDNAs to determine
which genes are expressed in the cell, tissue, organism, or other source of nucleic acids from which
the tags were derived. In this way, the expression pattern of a GENSET gene in the cell, tissue,
organism, or other source of nucleic acids is obtained.
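
The final comparison step, in which sequenced tags are matched against the GENSET cDNAs, might be sketched as below. This is a minimal illustration under stated assumptions: the anchoring enzyme is taken to be NlaIII (recognition site CATG) and each tag is taken to be the 10 bases immediately 3' of the 3'-most anchoring site, as in commonly used SAGE protocols; neither choice is specified in the text above. The function names and the example sequences are hypothetical.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # assumption: NlaIII used as the anchoring enzyme
TAG_LENGTH = 10       # assumption: 10-base tags


def reference_tag(cdna):
    """Return the tag expected from a cDNA, or None if no full tag exists."""
    seq = cdna.upper()
    pos = seq.rfind(ANCHOR_SITE)  # 3'-most anchoring site
    if pos == -1:
        return None
    start = pos + len(ANCHOR_SITE)
    tag = seq[start:start + TAG_LENGTH]
    return tag if len(tag) == TAG_LENGTH else None


def count_expression(observed_tags, cdnas):
    """Count observed tags per cDNA; tags matching no cDNA are ignored."""
    tag_to_name = {}
    for name, seq in cdnas.items():
        tag = reference_tag(seq)
        if tag is not None:
            tag_to_name[tag] = name
    counts = Counter()
    for tag in observed_tags:
        name = tag_to_name.get(tag.upper())
        if name is not None:
            counts[name] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical toy sequences, not GENSET cDNAs.
    cdnas = {
        "cdnaA": "GGGCATGAAAAACCCCCTTT",
        "cdnaB": "TTTCATGGGGGGTTTTTAAA",
    }
    tags = ["AAAAACCCCC", "AAAAACCCCC", "GGGGGTTTTT", "ACGTACGTAC"]
    print(count_expression(tags, cdnas))
```

In this simplified form two cDNAs that share the same tag would be indistinguishable (the later entry overwrites the earlier one); a fuller treatment would report such tags as ambiguous.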

Quantitative analysis of GENSET gene expression may also be performed using arrays. For
example, quantitative analysis of gene expression may be performed with GENSET
polynucleotides, or fragments thereof, in a complementary DNA microarray as described by Schena
et al. (1995 and 1996), which disclosures are hereby incorporated by reference in their entireties.
GENSET cDNAs or fragments thereof are amplified by PCR and arrayed from 96-well microtiter
plates onto silylated microscope slides using high-speed robotics. Printed arrays are incubated in a
humid chamber to allow rehydration of the array elements and rinsed, once in 0.2% SDS for 1 min,
twice in water for 1 min and once for 5 min in sodium borohydride solution. The arrays are
submerged in water for 2 min at 95°C, transferred into 0.2% SDS for 1 min, rinsed twice with
water, air dried and stored in the dark at 25°C. Cell or tissue mRNA is isolated or commercially
obtained and probes are prepared by a single round of reverse transcription. Probes are hybridized
to 1 cm² microarrays under a 14 x 14 mm glass coverslip for 6-12 hours at 60°C. Arrays are
washed for 5 min at 25°C in low stringency wash buffer (1X SSC/0.2% SDS), then for 10 min at
room temperature in high stringency wash buffer (0.1X SSC/0.2% SDS). Arrays are scanned in
0.1X SSC using a fluorescence laser scanning device fitted with a custom filter set. Accurate
differential expression measurements are obtained by taking the average of the ratios of two
independent hybridizations.
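
The averaging step mentioned above can be written out as a short sketch. It assumes that each hybridization yields, for a given array element, a pair of signal intensities (test sample and reference sample); the pairing, the function name and the example intensities are assumptions introduced for illustration and are not taken from Schena et al.

```python
def mean_expression_ratio(hybridizations):
    """Average the test/reference signal ratios of independent hybridizations.

    hybridizations -- list of (test_signal, reference_signal) pairs for one
                      array element, one pair per hybridization (hypothetical
                      input format).
    """
    if not hybridizations:
        raise ValueError("at least one hybridization is required")
    ratios = []
    for test_signal, reference_signal in hybridizations:
        if reference_signal <= 0:
            raise ValueError("reference signal must be positive")
        ratios.append(test_signal / reference_signal)
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Two hypothetical hybridizations of the same array element.
    print(mean_expression_ratio([(200.0, 100.0), (300.0, 100.0)]))
```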

Quantitative analysis of the expression of genes may also be performed with GENSET
cDNAs or fragments thereof in complementary DNA arrays as described by Pietu et al. (1996),
which disclosure is hereby incorporated by reference in its entirety. The GENSET polynucleotides
of the invention or fragments thereof are PCR amplified and spotted on membranes. Then, mRNAs
originating from various tissues or cells are labeled with radioactive nucleotides. After
hybridization and washing in controlled conditions, the hybridized mRNAs are detected by
phospho-imaging or autoradiography. Duplicate experiments are performed and a quantitative
analysis of differentially expressed mRNAs is then performed.

Alternatively, expression analysis of GENSET genes can be done through high density
nucleotide arrays as described by Lockhart et al. (1996) and Sosnowski et al. (1997), which
disclosures are hereby incorporated by reference in their entireties. Oligonucleotides of 15-50
nucleotides corresponding to sequences of a GENSET polynucleotide or fragments thereof are
synthesized directly on the chip (Lockhart et al., supra) or synthesized and then addressed to the
chip (Sosnowski et al., supra). Preferably, the oligonucleotides are about 20 nucleotides in length.
cDNA probes labeled with an appropriate compound, such as biotin, digoxigenin or fluorescent dye,
are synthesized from the appropriate mRNA population and then randomly fragmented to an
average size of 50 to 100 nucleotides. The said probes are then hybridized to the chip. After
washing as described in Lockhart et al. (supra) and application of different electric fields
(Sosnowski et al., supra), the dyes or labeling compounds are detected and quantified. Duplicate
hybridizations are performed. Comparative analysis of the intensity of the signal originating from
cDNA probes on the same target oligonucleotide in different cDNA samples indicates a differential
expression of the GENSET mRNA.
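
By way of a hedged example, candidate oligonucleotides of the stated length range could be enumerated from a polynucleotide sequence as sketched below. The tiling scheme (fixed-length windows advanced by a constant step), the function name and the example sequence are assumptions for illustration; the cited references select oligonucleotides by their own criteria, which are not reproduced here.

```python
def tile_oligonucleotides(sequence, length=20, step=1):
    """List the oligonucleotides of a given length found along a sequence.

    length -- oligonucleotide length; the text gives 15-50 nucleotides,
              preferably about 20
    step   -- offset between successive windows (hypothetical parameter)
    """
    if not 15 <= length <= 50:
        raise ValueError("length must be between 15 and 50 nucleotides")
    if step < 1:
        raise ValueError("step must be at least 1")
    seq = sequence.upper()
    last_start = len(seq) - length
    return [seq[i:i + length] for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # Hypothetical 25-nucleotide sequence, not a GENSET polynucleotide.
    for oligo in tile_oligonucleotides("ACGTACGTACGTACGTACGTACGTA", 20, 5):
        print(oligo)
```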

Uses of GENSET expression data 

Once the expression levels and patterns of a GENSET mRNA have been determined using
any technique known to those skilled in the art, in particular those described in the section entitled
"Evaluation of Expression Levels and Patterns of GENSET mRNAs", or using the instant
disclosure, this information may be used to design GENSET specific markers for detection,
identification, screening and diagnosis purposes as well as to design DNA constructs with an
expression pattern similar to a GENSET expression pattern.

Detection of GENSET expression and/or biological activity 

The invention further relates to methods of detection of GENSET expression and/or
biological activity in a biological sample using the polynucleotide and polypeptide sequences
described herein. Such methods can be used, for example, as a screen for normal or abnormal
GENSET expression and/or biological activity and, thus, can be used diagnostically. The biological
sample for use in the methods of the present invention includes a suitable sample from, for example,
a mammal, particularly a human. For example, the sample can be obtained from tissues or cell lines
having the same origin as tissues or cell lines in which the polypeptide is known to be expressed
using the data from Table IX.

Detection of GENSET products 

The invention further relates to methods of detection of GENSET polynucleotides or
polypeptides in a sample using the sequences described herein and any techniques known to those
skilled in the art. For example, a labeled polynucleotide probe having all or a functional portion of
the nucleotide sequence of a GENSET polynucleotide can be used in a method to detect a GENSET
polynucleotide in a sample. In one embodiment, the sample is treated to render the polynucleotides
in the sample available for hybridization to a polynucleotide probe, which can be DNA or RNA.
The resulting treated sample is combined with a labeled polynucleotide probe having all or a portion
of the nucleotide sequence of the GENSET cDNA or genomic sequence, under conditions
appropriate for hybridization of complementary sequences to occur. Detection of hybridization of
polynucleotides from the sample with the labeled nucleic acid probe indicates the presence of GENSET
polynucleotides in a sample. The presence of GENSET mRNA is indicative of GENSET
expression.

Consequently, the invention comprises methods for detecting the presence of a
polynucleotide comprising a nucleotide sequence selected from a group consisting of the sequences
of SEQ ID Nos: 1-241, the sequences of clone inserts of the deposited clone pool, sequences fully
complementary thereto, fragments and variants thereof in a sample. In a first embodiment, said
method comprises the following steps:

a) bringing into contact said sample and a nucleic acid probe or a plurality of nucleic acid
probes which hybridize to said selected nucleotide sequence; and

b) detecting the hybrid complex formed between said probe or said plurality of probes and
said polynucleotide.

In a preferred embodiment of the above detection method, said nucleic acid probe or said
plurality of nucleic acid probes is labeled with a detectable molecule. In another preferred
embodiment of the above detection method, said nucleic acid probe or said plurality of nucleic acid
probes has been immobilized on a substrate. In still another preferred embodiment, said nucleic
acid probe or said plurality of nucleic acid probes has a sequence comprised in a sequence
complementary to said selected sequence.

In a second embodiment, said method comprises the following steps:

a) contacting said sample with amplification reaction reagents comprising a pair of
amplification primers located on either side of the region of said nucleotide sequence to be
amplified;

b) performing an amplification reaction to synthesize amplification products containing said
region of said selected nucleotide sequence; and

c) detecting said amplification products.

In a preferred embodiment of the above detection method, when the polynucleotide to be
amplified is an RNA molecule, preliminary reverse transcription and synthesis of a second cDNA
strand are necessary to provide a DNA template to be amplified. In another preferred embodiment
of the above detection method, the amplification product is detected by hybridization with a labeled
probe having a sequence which is complementary to the amplified region. In still another preferred
embodiment, at least one of said amplification primers has a sequence comprised in said selected
sequence or in the sequence complementary to said selected sequence.
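
The amplification-based method above can be illustrated with a minimal in silico sketch. It assumes exact primer matches on a single written strand; the function names (reverse_complement, find_amplicon) and the example sequences are hypothetical and are not part of the disclosure. Real primer design must also account for melting temperature, mismatches and secondary structure, which this sketch ignores.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the region bounded by a primer pair, or None if a primer has no exact match.

    The forward primer is identical to the template strand as written. The
    reverse primer anneals to that strand, so its reverse complement is the
    text that appears in the template string downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    # The amplification product spans both primer binding sites.
    return template[start:end + len(reverse_site)]
```

A primer pair located on either side of the region of interest yields the bounded region; a primer absent from the template yields None, corresponding to no amplification product being detected in step c).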

Alternatively, a method of detecting GENSET expression in a test sample can be 
accomplished using any product which binds to a GENSET polypeptide of the present invention or 
a portion of a GENSET polypeptide. Such products may be antibodies, binding fragments of
antibodies, polypeptides able to bind specifically to GENSET polypeptides or fragments thereof, 
including GENSET agonists and antagonists. Detection of specific binding to the antibody indicates 
the presence of a GENSET polypeptide in the sample (e.g., ELISA). 

Consequently, the invention is also directed to a method for detecting specifically the
presence of a GENSET polypeptide according to the invention in a biological sample, said method
comprising the following steps of:

a) bringing into contact said biological sample with a product able to bind to a polypeptide
of the invention or fragments thereof;

b) allowing said product to bind to said polypeptide to form a complex; and

c) detecting said complex.

In a preferred embodiment of the above detection method, the product is an antibody. In a 
more preferred embodiment, said antibody is labeled with a detectable molecule. In another more 
preferred embodiment of the above detection method, said antibody has been immobilized on a 
substrate. 

In addition, the invention also relates to methods of determining whether a GENSET
product (e.g. a polynucleotide or polypeptide) is present or absent in a biological sample, said
methods comprising the steps of:

a) obtaining said biological sample from a human or non-human animal, preferably a 
mammal; 

b) contacting said biological sample with a product able to bind to a GENSET
polynucleotide or polypeptide of the invention; and

c) determining the presence or absence of said GENSET product in said biological sample.

Compounds that specifically bind a GENSET product may be either compounds binding to
a GENSET polypeptide (e.g. binding proteins, antibodies or binding fragments thereof, such as
F(ab')2 fragments) or compounds binding to a GENSET polynucleotide (e.g. a complementary probe
or primer).

The present invention also relates to kits that can be used in the detection of GENSET
expression products. The kit can comprise a compound that specifically binds a GENSET
polypeptide (e.g. binding proteins, antibodies or binding fragments thereof, such as F(ab')2
fragments) or a GENSET mRNA (e.g. a complementary probe or primer), for example, disposed
within a container means. The kit can further comprise ancillary reagents, including buffers and
the like.

Detection of a GENSET biological activity 
The invention further includes methods of detecting specifically a GENSET biological
activity. Assessing the GENSET biological activity may be performed using a variety of
techniques, including those described elsewhere herein.

Consequently, the invention is directed to a method for detecting specifically GENSET
biological activity in a biological sample, said method comprising the following steps:

a) obtaining a biological sample from a human or non-human mammal; and 

b) detecting a GENSET biological activity. 

The present invention also relates to kits that can be used in the detection of GENSET
biological activity.

Identification of a specific context of GENSET expression 

When the expression pattern of a GENSET mRNA shows that a GENSET gene is
specifically expressed in a given context, probes and primers specific for this gene as well as
antibodies binding to the corresponding GENSET polypeptide may then be used as markers for that
specific context. Examples of specific contexts are: specific expression in a given tissue/cell or
tissue/cell type, expression at a given stage of development of a process such as embryo
development or disease development, or specific expression in a given organelle. Such primers,
probes, and antibodies are useful commercially to identify tissues/cells/organelles of unknown
origin, for example, forensic samples or differentiated tumor tissue that has metastasized to foreign
bodily sites, or to differentiate different tissue types in a tissue cross-section using any technique
known to those skilled in the art, including in situ PCR or immunochemistry, for example.

For example, the cDNAs and proteins of the sequence listing, and fragments thereof, may be
used to distinguish human tissues/cells from non-human tissues/cells and to distinguish between
human tissues/cells/organelles that do and do not express the polynucleotides comprising the
cDNAs. By knowing the expression pattern of a given GENSET, either through routine
experimentation or by using the instant disclosure, the polynucleotides and polypeptides of the
present invention may be used in methods of determining the identity of an unknown tissue/cell
sample/organelle. As part of determining the identity of an unknown tissue/cell sample/organelle,
the polynucleotides and polypeptides of the present invention may be used to determine what the
unknown tissue/cell sample is and what the unknown sample is not. For example, if a cDNA is
expressed in a particular tissue/cell type/organelle, and the unknown tissue/cell sample/organelle
does not express the cDNA, it may be inferred that the unknown tissue/cells are either not human or
not the same human tissue/cell type/organelle as that which expresses the cDNA. These methods of
determining tissue/cell/organelle identity are based on methods which detect the presence or
absence of the mRNA (or corresponding cDNA) in a tissue/cell sample using methods well known in
the art (e.g., hybridization, PCR based methods, immunoassays, immunochemistry, ELISA).
Examples of such techniques are described in more detail below. Therefore, the invention
encompasses uses of the polynucleotides and polypeptides of the invention as tissue markers. In a
preferred embodiment, polynucleotides preferentially expressed in given tissues as indicated in
Table X and polypeptides encoded by such polynucleotides are used for this purpose. The
invention also encompasses uses of polypeptides of the invention as organelle markers. In a
preferred embodiment, polypeptides preferentially expressed in a given subcellular compartment as
indicated in Table XI are used for this purpose.



Consequently, the present invention encompasses methods of identification of a tissue/cell
type/subcellular compartment, wherein said method includes the steps of:

a) contacting a biological sample whose identity is to be assayed with a product able to bind
a GENSET product; and

b) determining whether a GENSET product is expressed in said biological sample.

Products that are able to bind specifically to a GENSET product, namely a GENSET
polypeptide or a GENSET mRNA, include GENSET binding proteins, antibodies or binding
fragments thereof (e.g. F(ab')2 fragments), as well as GENSET complementary probes and primers.

Step b) may be performed using any detection method known to those skilled in the art
including those disclosed herein, especially in the section entitled "Detection of GENSET
expression and/or biological activity".

Identification of Tissue Types or Cell Species by Means of Labeled Tissue Specific Antibodies

Identification of specific tissues is accomplished by the visualization of tissue specific
antigens by means of antibody preparations which are conjugated, directly (e.g., green fluorescent
protein) or indirectly to a detectable marker. Selected labeled antibody species bind to their specific
antigen binding partner in tissue sections, cell suspensions, or in extracts of soluble proteins from a
tissue sample to provide a pattern for qualitative or semi-qualitative interpretation.

Antisera for these procedures must have a potency exceeding that of the native preparation,
and for that reason, antibodies are concentrated to a mg/ml level by isolation of the gamma globulin
fraction, for example, by ion-exchange chromatography or by ammonium sulfate fractionation.
Also, to provide the most specific antisera, unwanted antibodies, for example to common proteins,
must be removed from the gamma globulin fraction, for example by means of insoluble
immunoabsorbents, before the antibodies are labeled with the marker. Either monoclonal or
heterologous antisera are suitable for either procedure.

A. Immunohistochemical Techniques 

Purified, high-titer antibodies, prepared as described above, are conjugated to a detectable
marker, as described, for example, by Fudenberg, (1980) or Rose et al., (1980), which disclosures
are hereby incorporated by reference in their entireties.
A fluorescent marker, either fluorescein or rhodamine, is preferred, but antibodies can also
be labeled with an enzyme that supports a color producing reaction with a substrate, such as
horseradish peroxidase. Markers can be added to tissue-bound antibody in a second step, as
described below. Alternatively, the specific anti-tissue antibodies can be labeled with ferritin or
other electron dense particles, and localization of the ferritin coupled antigen-antibody complexes
achieved by means of an electron microscope. In yet another approach, the antibodies are
radiolabeled with, for example, 125I, and detected by overlaying the antibody treated preparation
with photographic emulsion. Preparations to carry out the procedures can comprise monoclonal or
polyclonal antibodies to a single protein or peptide identified as specific to a tissue type, for
example, brain tissue, or antibody preparations to several antigenically distinct tissue specific
antigens can be used in panels, independently or in mixtures, as required. Tissue sections and cell
suspensions are prepared for immunohistochemical examination according to common histological
techniques. Multiple cryostat sections (about 4 µm, unfixed) of the unknown tissue and known
control are mounted, and each slide is covered with a different dilution of the antibody preparation.
Sections of known and unknown tissues should also be treated with preparations to provide a
positive control, a negative control, for example, pre-immune sera, and a control for non-specific
staining, for example, buffer. Treated sections are incubated in a humid chamber for 30 min at
room temperature, rinsed, then washed in buffer for 30-45 min. Excess fluid is blotted away, and
the marker developed. If the tissue specific antibody was not labeled in the first incubation, it can
be labeled at this time in a second antibody-antibody reaction, for example, by adding fluorescein-
or enzyme-conjugated antibody against the immunoglobulin class of the antiserum-producing
species, for example, fluorescein labeled antibody to mouse IgG. Such labeled sera are
commercially available. The antigen found in the tissues by the above procedure can be quantified
by measuring the intensity of color or fluorescence on the tissue section, and calibrating that signal
using appropriate standards.

B. Identification of Tissue Specific Soluble Proteins 

The visualization of tissue specific proteins and identification of unknown tissues from that
procedure is carried out using the labeled antibody reagents and detection strategy as described for
immunohistochemistry; however, the sample is prepared according to an electrophoretic technique
to distribute the proteins extracted from the tissue in an orderly array on the basis of molecular
weight for detection. A tissue sample is homogenized using a Virtis apparatus; cell suspensions are
disrupted by Dounce homogenization or osmotic lysis, using detergents in either case as required to
disrupt cell membranes, as is the practice in the art. Insoluble cell components such as nuclei,
microsomes, and membrane fragments are removed by ultracentrifugation, and the soluble
protein-containing fraction is concentrated if necessary and reserved for analysis. A sample of the
soluble protein solution is resolved into individual protein species by conventional SDS
polyacrylamide electrophoresis as described, for example, by Davis et al., Section 19-2 (1986),
using a range of amounts of polyacrylamide in a set of gels to resolve the entire molecular weight
range of proteins to be detected in the sample. A size marker is run in parallel for purposes of
estimating molecular weights of the constituent proteins. Sample size for analysis is a convenient
volume of from 5 to 55 µl, containing from about 1 to 100 µg protein. An aliquot of each of the
resolved proteins is transferred by blotting to a nitrocellulose filter paper, a process that maintains
the pattern of resolution. Multiple copies are prepared. The procedure, known as Western Blot
Analysis, is well described in Davis et al., (1986) Section 19-3. One set of nitrocellulose blots is
stained with Coomassie Blue dye to visualize the entire set of proteins for comparison with the
antibody bound proteins. The remaining nitrocellulose filters are then incubated with a solution of
one or more specific antisera to tissue specific proteins prepared as described herein. In this
procedure, as in procedure A above, appropriate positive and negative sample and reagent controls
are run.

In either procedure A or B, a detectable label can be attached to the primary tissue
antigen-primary antibody complex according to various strategies and permutations thereof. In a
straightforward approach, the primary specific antibody can be labeled; alternatively, the unlabeled
complex can be bound by a labeled secondary anti-IgG antibody. In other approaches, either the
primary or secondary antibody is conjugated to a biotin molecule, which can, in a subsequent step,
bind an avidin conjugated marker. According to yet another strategy, enzyme labeled or radioactive
protein A, which has the property of binding to any IgG, is bound in a final step to either the
primary or secondary antibody. The visualization of tissue specific antigen binding at levels above
those seen in control tissues to one or more tissue specific antibodies, prepared from the gene
sequences identified from cDNA sequences, can identify tissues of unknown origin, for example,
forensic samples, or differentiated tumor tissue that has metastasized to foreign bodily sites.

Targeting of compounds to subcellular compartments 

GENSET polypeptides expressed in specific cellular compartments/organelles may also be
used to target compounds to these compartments/organelles. The invention therefore encompasses
uses of polypeptides and polynucleotides of the invention as organelle targeting tools.

In a first embodiment, GENSET polypeptides expressed in mitochondria may be used to
target heterologous compounds, either polypeptides or polynucleotides, to mitochondria by
recombinantly or chemically fusing a fragment of the protein of the invention to a heterologous
polypeptide or polynucleotide. Preferred fragments are signal peptides, amphiphilic alpha helices
and/or any other fragments of the protein of the invention, or part thereof, that may contain
targeting signals for mitochondria including but not limited to matrix targeting signals as defined in
Herrman and Neupert, (2000); Bhagwat et al. (1999); Murphy (1997); Glaser et al. (1998);
Ciminale et al. (1999), which disclosures are hereby incorporated by reference in their entireties.
Such heterologous compounds may be used to modulate mitochondrial activities. For example,
they may be used to induce and/or prevent mitochondrial-induced apoptosis or necrosis. In
addition, heterologous polynucleotides may be used for mitochondrial gene therapy to replace a
defective mitochondrial gene and/or to inhibit the deleterious expression of a mitochondrial gene.

In a second embodiment, GENSET polypeptides expressed in the endoplasmic reticulum may
be used to target heterologous polypeptides to the endoplasmic reticulum by recombinantly or
chemically fusing a fragment of the proteins of the invention to a heterologous polypeptide. Preferred
fragments are any fragments of the proteins of the invention, or part thereof, that may contain targeting
signals for the endoplasmic reticulum such as those described in Pidoux and Armstrong (1992); Munro
and Pelham (1987); Pelham (1990), which disclosures are hereby incorporated by reference in their
entireties.

In a third embodiment, GENSET polypeptides expressed in the nucleus may be used to target
heterologous polypeptides or polynucleotides to the nucleus by recombinantly or chemically fusing a
fragment of the proteins or polynucleotides of the invention to a heterologous polypeptide or
polynucleotide. Preferred fragments are any fragments of the proteins or polynucleotides of the
invention, or part thereof, that may contain targeting signals for the nucleus (nuclear localization
signals) such as those described in Christophe et al. (2000), which disclosure is hereby incorporated by
reference in its entirety.

In a fourth embodiment, GENSET polypeptides expressed in the Golgi apparatus may be used to
target heterologous polypeptides to the Golgi apparatus by recombinantly or chemically fusing a
fragment of the protein of the invention to a heterologous polypeptide. Preferred fragments are
signal peptides, transmembrane domains, tyrosine containing regions and/or any other fragments of
the proteins of the invention, or part thereof, that may contain (1) targeting signals for the Golgi
apparatus such as the ones described in Ugur and Jones, (2000); Picetti and Borrelli, (2000), (2) a
tyrosine-based Golgi targeting signal region (Zhan et al., (1998); Watson and Pessin (2000); Ward
and Moss (2000)), or (3) any other region as defined in Munro, (1998); Luetterforst et al., (1999);
Essl et al., (1999), which disclosures are hereby incorporated by reference in their entireties.

Screening and diagnosis of abnormal GENSET expression and/or biological activity 

Moreover, antibodies and/or primers specific for GENSET expression may also be used to
identify abnormal GENSET expression and/or biological activity, and subsequently to screen and/or
diagnose disorders associated with abnormal GENSET expression. For example, a particular
disease may result from lack of expression, over expression, or under expression of a GENSET
mRNA. By comparing mRNA expression patterns and quantities in samples taken from healthy
individuals with those from individuals suffering from a particular disorder, genes responsible for
this disorder may be identified. Primers, probes and antibodies specific for this GENSET may then
be used to develop screening and diagnostic kits for a disorder in which the gene of interest is
specifically expressed or in which its expression is specifically dysregulated, i.e. underexpressed or
overexpressed.



Screening for specific disorders 

The present invention also relates to methods of identifying individuals having elevated or
reduced levels of GENSET, which individuals are likely to benefit from therapies to suppress or
enhance GENSET expression, respectively. One example of such methods comprises the steps of:

a) obtaining from a human or non-human mammal a biological sample;

b) detecting the presence in said sample of a GENSET product (mRNA or protein) using
any method known to those skilled in the art including those described herein, especially in the
section entitled "Detection of GENSET products";

c) comparing the amount of said GENSET product present in said sample with that of a
control sample; and

d) determining whether said human or non-human mammal has a reduced or elevated level of
GENSET expression compared to the control sample.
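
Steps c) and d) above can be sketched as a simple comparison of a sample measurement against a control. The function name classify_expression and the two-fold threshold are assumptions chosen for illustration only; the disclosure does not specify a cutoff, and any real assay would need a validated threshold and appropriate normalization.

```python
def classify_expression(sample_level: float,
                        control_level: float,
                        fold_threshold: float = 2.0) -> str:
    """Classify a sample as 'elevated', 'reduced' or 'unchanged' relative to a control.

    Levels are any non-negative quantities measured on the same scale
    (for example, normalized band intensity). The fold_threshold default
    of 2.0 is an illustrative assumption, not a value from the source text.
    """
    if control_level <= 0 or sample_level < 0:
        raise ValueError("control must be positive and sample non-negative")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    ratio = sample_level / control_level
    if ratio >= fold_threshold:
        return "elevated"
    if ratio <= 1 / fold_threshold:
        return "reduced"
    return "unchanged"
```

Under these assumptions, a sample measured at 10.0 against a control of 4.0 would be classified as elevated, and a sample measured at 1.0 against the same control as reduced.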

Such individuals with reduced or elevated levels of GENSET products may be predisposed
to disorders associated with dysregulation of GENSET gene expression and thus would be
candidates for therapies. The identification of elevated levels of GENSET in a patient would be
indicative of an individual that would benefit from treatment with agents that suppress GENSET
expression or activity. The identification of low levels of GENSET in a patient would be indicative
of an individual that would benefit from agents that induce GENSET expression or activity.

Biological samples suitable for use in this method include biological fluids such as blood,
lymph, saliva, sperm, maternal milk, and tissue samples (e.g. biopsies) as well as cell cultures or
cell extracts derived, for example, from tissue biopsies. The detection step of the present method
can be performed using standard protocols for protein/mRNA detection. Examples of suitable
protocols include Northern blot analysis, immunoassays (e.g. RIA, Western blots,
immunohistochemical analyses), and PCR.

Thus, the present invention further relates to methods of identifying individuals or non-human
animals at increased risk for developing, or presently having, certain
diseases/disorders associated with abnormal GENSET expression or biological activity. One
example of such methods comprises the steps of:

a) obtaining from a human or non-human mammal a biological sample;

b) detecting the presence or absence in said sample of a GENSET product (mRNA or
protein);

c) comparing the amount of said GENSET product present in said sample with that of a
control sample; and

d) determining whether said human or non-human mammal is at increased risk for
developing, or presently has, a disease or disorder.

In accordance with this method, the presence in the sample of altered levels of GENSET
product indicates that the subject is predisposed to the above-indicated diseases/disorders.
Biological samples suitable for use in this method include biological fluids such as blood, lymph,
saliva, sperm, maternal milk, and tissue samples (e.g. biopsies).

The diagnostic methodologies described herein are applicable to both humans and 
non-human mammals. 

Detection of GENSET mutations

The invention also encompasses methods to detect mutations in GENSET polynucleotides
of the invention. Such methods may advantageously be used to detect mutations occurring in
GENSET genes and preferably in their regulatory regions. When a mutation has been proven to be
associated with a disease, screening for such mutations may be used for screening and diagnosis
purposes.

In one embodiment of the oligonucleotide arrays of the invention, an oligonucleotide probe
matrix may advantageously be used to detect mutations occurring in GENSET genes and preferably
in their regulatory regions. For this particular purpose, probes are specifically designed to have a
nucleotide sequence allowing their hybridization to the genes that carry known mutations (either by
deletion, insertion or substitution of one or several nucleotides). By known mutations is meant
mutations on the GENSET genes that have been identified according, for example, to the technique
used by Huang et al. (1996) or Samson et al. (1996), which disclosures are hereby incorporated by
reference in their entireties.

Another technique that is used to detect mutations in GENSET genes is the use of a
high-density DNA array. Each oligonucleotide probe constituting a unit element of the high density
DNA array is designed to match a specific subsequence of a GENSET genomic DNA or cDNA.
Thus, an array consisting of oligonucleotides complementary to subsequences of the target gene
sequence is used to determine the identity of the target sequence with the wild-type gene sequence,
measure its amount, and detect differences between the target sequence and the reference wild-type
sequence of the GENSET gene. In one such design, termed a 4L tiled array, a set of four probes
(A, C, G, T), preferably 15-nucleotide oligomers, is implemented for each position of the reference
sequence. In each set of four probes, the perfect complement will hybridize more strongly than
mismatched probes. Consequently, a nucleic acid target of length L is scanned for mutations with a
tiled array containing 4L probes, the whole probe set containing all the possible mutations in the
known wild-type reference sequence. The hybridization signals of the 15-mer probe set tiled array
are perturbed by a single base change in the target sequence. As a consequence, there is a
characteristic loss of signal or a "footprint" for the probes flanking a mutation position. This
technique was described by Chee et al. in 1996, which disclosure is hereby incorporated by
reference in its entirety.
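
The 4L tiling principle described above can be sketched in a few lines. This is a simplified illustration, not the array design of the cited reference: the names tiled_array_probes and call_base are hypothetical, hybridization is modeled as exact string complementarity, and positions closer than seven nucleotides to either end of the reference are skipped, so the sketch produces 4 x (L - 14) probes rather than a full 4L.

```python
BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def tiled_array_probes(reference: str, probe_length: int = 15) -> dict:
    """Build one set of four probes per interrogated reference position.

    Each probe is complementary to the reference window centered on that
    position, with the central base substituted by A, C, G or T in turn.
    Returns {position: {target_base: probe_sequence}}.
    """
    if probe_length % 2 == 0:
        raise ValueError("probe_length must be odd so that a central base exists")
    half = probe_length // 2
    probe_sets = {}
    for center in range(half, len(reference) - half):
        window = reference[center - half:center + half + 1]
        probe_sets[center] = {
            base: reverse_complement(window[:half] + base + window[half + 1:])
            for base in BASES
        }
    return probe_sets


def call_base(probe_set: dict, target_window: str):
    """Return the base whose probe is a perfect complement of the target window.

    Returns None when no probe matches perfectly, which is the modeled
    equivalent of the loss of signal ("footprint") next to a mutation.
    """
    perfect = reverse_complement(target_window)
    for base, probe in probe_set.items():
        if probe == perfect:
            return base
    return None
```

In this model, a single substitution in the target is called by the probe set centered on the mutated position, while the probe sets centered on neighboring positions return None because every one of their probes now carries a flanking mismatch.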



Construction of DNA constructs with a GENSET expression pattern 

In addition, characterization of the spatial and temporal expression patterns and expression
levels of GENSET mRNAs is also useful for constructing expression vectors capable of producing a
desired level of gene product in a desired spatial or temporal manner, as discussed below.

DNA Construct That Enables Directing Temporal And Spatial GENSET Gene Expression In 
Recombinant Cell Hosts And In Transgenic Animals. 

In order to study the physiological and phenotypic consequences of a lack of synthesis of a
GENSET protein, both at the cell level and at the multicellular organism level, the invention also
encompasses DNA constructs and recombinant vectors enabling a conditional expression of a
specific allele of a GENSET genomic sequence or cDNA and also of a copy of this genomic
sequence or cDNA harboring substitutions, deletions, or additions of one or more bases with respect
to a nucleotide sequence selected from the group consisting of sequences of SEQ ID Nos 1-241 and
sequences of clone inserts of the deposited clone pool, or a fragment thereof, these base
substitutions, deletions or additions being located either in an exon, an intron or a regulatory
sequence, but preferably in the 5'-regulatory sequence or in an exon of the GENSET genomic
sequence or within the GENSET cDNA.

A first preferred DNA construct is based on the tetracycline resistance operon tet from E.
coli transposon Tn10 for controlling the GENSET gene expression, such as described by Gossen et
al. (1992, 1995) and Furth et al. (1994), which disclosures are hereby incorporated by reference in
their entireties. Such a DNA construct contains seven tet operator sequences from Tn10 (tetop) that
are fused to either a minimal promoter or a 5'-regulatory sequence of the GENSET gene, said
minimal promoter or said GENSET regulatory sequence being operably linked to a polynucleotide
of interest that codes either for a sense or an antisense oligonucleotide or for a polypeptide,
including a GENSET polypeptide or a peptide fragment thereof. This DNA construct is functional
as a conditional expression system for the nucleotide sequence of interest when the same cell also
comprises a nucleotide sequence coding for either the wild type (tTA) or the mutant (rTA) repressor
fused to the activating domain of viral protein VP16 of herpes simplex virus, placed under the
control of a promoter, such as the HCMVIE1 enhancer/promoter or the MMTV-LTR. Indeed, a
preferred DNA construct of the invention comprises both the polynucleotide containing the tet
operator sequences and the polynucleotide containing a sequence coding for the tTA or the rTA
repressor. In a specific embodiment, the conditional expression DNA construct contains the
sequence encoding the mutant tetracycline repressor rTA, such that the expression of the
polynucleotide of interest is silent in the absence of tetracycline and induced in its presence.
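
The conditional logic of this system can be summarized as a small truth table. The sketch below is illustrative only: the function name expression_is_on is hypothetical, the rTA behavior is taken from the paragraph above, and the tTA behavior (active in the absence of tetracycline) is stated here as the commonly reported property of the wild-type fusion rather than something spelled out in this section.

```python
def expression_is_on(transactivator: str, tetracycline_present: bool) -> bool:
    """Return True if the polynucleotide of interest is expected to be expressed.

    "rTA": mutant repressor fusion, silent without tetracycline and induced
           in its presence (as stated in the text above).
    "tTA": wild-type repressor fusion, assumed here to drive expression only
           in the absence of tetracycline.
    """
    if transactivator == "rTA":
        return tetracycline_present
    if transactivator == "tTA":
        return not tetracycline_present
    raise ValueError("transactivator must be 'tTA' or 'rTA'")
```

The same tet operator construct therefore gives opposite responses to tetracycline depending on which fusion protein is co-expressed in the cell.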
DNA Constructs Allowing Homologous Recombination: Replacement Vectors

A second preferred DNA construct will comprise, from 5'-end to 3'-end: (a) a first
nucleotide sequence that is comprised in the GENSET genomic sequence; (b) a nucleotide sequence
comprising a positive selection marker, such as the marker for neomycin resistance (neo); and (c) a
second nucleotide sequence that is comprised in the GENSET genomic sequence, and is located on
the genome downstream of the first GENSET nucleotide sequence (a).

In a preferred embodiment, this DNA construct also comprises a negative selection marker
located upstream of the nucleotide sequence (a) or downstream of the nucleotide sequence (c).
Preferably, the negative selection marker comprises the thymidine kinase (tk) gene (Thomas et al.,
1986), the hygromycin beta gene (Te Riele et al., 1990), the hprt gene (Van der Lugt et al., 1991;
Reid et al., 1990) or the Diphtheria toxin A fragment (Dt-A) gene (Nada et al., 1993; Yagi et
al., 1990), which disclosures are hereby incorporated by reference in their entireties. Preferably, the
positive selection marker is located within a GENSET exon sequence so as to interrupt the sequence
encoding a GENSET protein. These replacement vectors are described, for example, by Thomas et
al. (1986; 1987), Mansour et al. (1988) and Roller et al. (1992).

The first and second nucleotide sequences (a) and (c) may be indifferently located within a
GENSET regulatory sequence, an intronic sequence, an exon sequence or a sequence containing
both regulatory and/or intronic and/or exon sequences. The size of the nucleotide sequences (a) and
(c) ranges from 1 to 50 kb, preferably from 1 to 10 kb, more preferably from 2 to 6 kb and most
preferably from 2 to 4 kb.

DNA Constructs Allowing Homologous Recombination: Cre-LoxP System. 

These new DNA constructs make use of the site specific recombination system of the P1
phage. The P1 phage possesses a recombinase called Cre which interacts specifically with a 34
base pair loxP site. The loxP site is composed of two palindromic sequences of 13 bp separated by
an 8 bp conserved sequence (Hoess et al., 1986), which disclosure is hereby incorporated by
reference in its entirety. The recombination by the Cre enzyme between two loxP sites having an
identical orientation leads to the deletion of the DNA fragment located between them.
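
The deletion outcome described above can be modeled on a plain sequence string. This is a minimal sketch under stated assumptions: the constant LOXP holds the commonly reported 34 bp loxP sequence (two 13 bp inverted repeats around an 8 bp spacer), the function name cre_excise is hypothetical, and only two directly repeated sites on the written strand are handled. Inverted sites, which give inversion rather than deletion, are outside this sketch.

```python
# Commonly reported loxP site: 13 bp repeat + 8 bp spacer + 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(dna: str) -> str:
    """Model Cre-mediated deletion between two loxP sites in the same orientation.

    The segment between the first two directly repeated loxP sites is removed
    and a single loxP site remains. If fewer than two sites are found, the
    sequence is returned unchanged.
    """
    first = dna.find(LOXP)
    if first == -1:
        return dna
    second = dna.find(LOXP, first + len(LOXP))
    if second == -1:
        return dna
    # Keep everything up to and including the first site, then resume
    # after the second site, dropping the intervening fragment.
    return dna[:first + len(LOXP)] + dna[second + len(LOXP):]
```

For a sequence of the form upstream + loxP + fragment + loxP + downstream, the function returns upstream + loxP + downstream.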

The Cre-loxP system used in combination with a homologous recombination technique was
first described by Gu et al. (1993, 1994), which disclosures are hereby incorporated by
reference in their entireties. Briefly, a nucleotide sequence of interest to be inserted in a targeted
location of the genome harbors at least two loxP sites in the same orientation and located at the
respective ends of a nucleotide sequence to be excised from the recombinant genome. The excision
event requires the presence of the recombinase (Cre) enzyme within the nucleus of the recombinant
cell host. The recombinase enzyme may be supplied at the desired time by: (a) incubating the
recombinant cell hosts in a culture medium containing this enzyme, injecting the Cre enzyme
directly into the desired cell, such as described by Araki et al. (1995), which disclosure is hereby
incorporated by reference in its entirety, or lipofection of the enzyme into the cells, such as
described by Baubonis et al. (1993), which disclosure is hereby incorporated by reference in its
entirety; (b) transfecting the cell host with a vector comprising the Cre coding sequence operably
linked to a promoter functional in the recombinant cell host, which promoter is optionally
inducible, said vector being introduced in the recombinant cell host, such as described by Gu et
al. (1993) and Sauer et al. (1988), which disclosures are hereby incorporated by reference in their
entireties; or (c) introducing in the genome of the cell host a polynucleotide comprising the Cre
coding sequence operably linked to a promoter functional in the recombinant cell host, which
promoter is optionally inducible, said polynucleotide being inserted in the genome of the cell host
either by a random insertion event or a homologous recombination event, such as described by Gu et
al. (1994).

In a specific embodiment, the vector containing the sequence to be inserted in the GENSET 
gene by homologous recombination is constructed in such a way that selectable markers are flanked 
by loxP sites of the same orientation. It is then possible, by treatment with the Cre enzyme, to 
eliminate the selectable markers while leaving the GENSET sequences of interest that have been 
inserted by a homologous recombination event. Again, two selectable markers are needed: a positive 
selection marker to select for the recombination event and a negative selection marker to select for the 
homologous recombination event. Vectors and methods using the Cre-loxP system are described by 
Zou et al. (1994), which disclosure is hereby incorporated by reference in its entirety. 

Thus, a third preferred DNA construct of the invention comprises, from 5'-end to 3'-end: 
(a) a first nucleotide sequence that is comprised in the GENSET genomic sequence; (b) a nucleotide 
sequence comprising a polynucleotide encoding a positive selection marker, said nucleotide 
sequence comprising additionally two sequences defining a site recognized by a recombinase, such 
as a loxP site, the two sites being placed in the same orientation; and (c) a second nucleotide 
sequence that is comprised in the GENSET genomic sequence, and is located on the genome 
downstream of the first GENSET nucleotide sequence (a). 

The sequences defining a site recognized by a recombinase, such as a loxP site, are 
preferably located within the nucleotide sequence (b) at suitable locations bordering the nucleotide 
sequence for which the conditional excision is sought. In one specific embodiment, two loxP sites 
are located one on each side of the positive selection marker sequence, in order to allow its excision 
at a desired time after the occurrence of the homologous recombination event. 

In a preferred embodiment of a method using the third DNA construct described above, the 
excision of the polynucleotide fragment bordered by the two sites recognized by a recombinase, 
preferably two loxP sites, is performed at a desired time, due to the presence within the genome of 
the recombinant host cell of a sequence encoding the Cre enzyme operably linked to a promoter 
sequence, preferably an inducible promoter, more preferably a tissue-specific promoter sequence 
and most preferably a promoter sequence which is both inducible and tissue-specific, such as 
described by Gu et al. (1994). 

The presence of the Cre enzyme within the genome of the recombinant cell host may result 
from the breeding of two transgenic animals, the first transgenic animal bearing the GENSET-derived 
sequence of interest containing the loxP sites as described above and the second transgenic 
animal bearing the Cre coding sequence operably linked to a suitable promoter sequence, such as 
described by Gu et al. (1994). 

Spatio-temporal control of the Cre enzyme expression may also be achieved with an 
adenovirus based vector that contains the Cre gene thus allowing infection of cells, or in vivo 
infection of organs, for delivery of the Cre enzyme, such as described by Anton and Graham (1995) 
and Kanegae et al. (1995), which disclosures are hereby incorporated by reference in their entireties. 

The DNA constructs described above may be used to introduce a desired nucleotide 
sequence of the invention, preferably a GENSET genomic sequence or a GENSET cDNA sequence, 
and most preferably an altered copy of a GENSET genomic or cDNA sequence, within a 
predetermined location of the targeted genome, leading either to the generation of an altered copy of 
a targeted gene (knock-out homologous recombination) or to the replacement of a copy of the 
targeted gene by another copy sufficiently homologous to allow a homologous recombination 
event to occur (knock-in homologous recombination). 

Modifying GENSET expression and/or biological activity 

Modifying endogenous GENSET expression and/or biological activity is expressly 
contemplated by the present invention. 

Screening for compounds that modulate GENSET expression and/or biological activity 

The present invention further relates to compounds able to modulate GENSET expression 
and/or biological activity and methods to use these compounds. Such compounds may interact with 
the regulatory sequences of GENSET genes or they may interact with GENSET polypeptides 
directly or indirectly. 

Compounds Interacting With GENSET Regulatory Sequences 

The present invention also concerns a method for screening substances or molecules that are 
able to interact with the regulatory sequences of a GENSET gene, such as for example promoter or 
enhancer sequences in untranscribed regions of the genomic DNA, as determined using any 
techniques known to those skilled in the art including those described in the section entitled 
"Identification of Promoters in Cloned Upstream Sequences", or such as regulatory sequences 
located in untranslated regions of GENSET mRNA. 

Sequences within untranscribed or untranslated regions of polynucleotides of the invention 
may be identified by comparison to databases containing known regulatory sequences such as 
transcription start sites, transcription factor binding sites, promoter sequences, enhancer sequences, 
5'UTR and 3'UTR elements (Pesole et al., 2000; http://igs-server.cnrs-mrs.fr/-gauthere/UTR/index.htm). 
Alternatively, the regulatory sequences of interest may be 
identified through conventional mutagenesis or deletion analyses of reporter plasmids using, for 
instance, techniques described in the section entitled "Identification of Promoters in Cloned 
Upstream Sequences". 

Following the identification of potential GENSET regulatory sequences, proteins which 
interact with these regulatory sequences may be identified as described below. 

Gel retardation assays may be performed independently in order to screen candidate 
molecules that are able to interact with the regulatory sequences of the GENSET gene, such as 
described by Fried and Crothers (1981), Garner and Revzin (1981) and Dent and Latchman (1993), 
the teachings of these publications being herein incorporated by reference. These techniques are 
based on the principle according to which a DNA or mRNA fragment which is bound to a protein 
migrates more slowly than the same unbound DNA or mRNA fragment. Briefly, the target nucleotide 
sequence is labeled. Then the labeled target nucleotide sequence is brought into contact with either 
a total nuclear extract from cells containing regulation factors, or with different candidate molecules 
to be tested. The interaction between the target regulatory sequence of the GENSET gene and the 
candidate molecule or the regulation factor is detected after gel or capillary electrophoresis through 
a retardation in the migration. 

Nucleic acids encoding proteins which are able to interact with the promoter sequence of 
the GENSET gene, more particularly a nucleotide sequence selected from the group consisting of 
the polynucleotides of the 5' and 3' regulatory region or a fragment or variant thereof, may be 
identified by using a one-hybrid system, such as that described in the booklet enclosed in the 
Matchmaker One-Hybrid System kit from Clontech (Catalog Ref. n° K 1603-1), the technical 
teachings of which are herein incorporated by reference. Briefly, the target nucleotide sequence is 
cloned upstream of a selectable reporter sequence and the resulting polynucleotide construct is 
integrated in the yeast genome (Saccharomyces cerevisiae). Preferably, multiple copies of the 
target sequences are inserted into the reporter plasmid in tandem. The yeast cells containing the 
reporter sequence in their genome are then transformed with a library comprising fusion molecules 
between cDNAs encoding candidate proteins for binding onto the regulatory sequences of the 
GENSET gene and sequences encoding the activator domain of a yeast transcription factor such as 
GAL4. The recombinant yeast cells are plated in a culture broth for selecting cells expressing the 
reporter sequence. The recombinant yeast cells thus selected contain a fusion protein that is able to 
bind onto the target regulatory sequence of the GENSET gene. Then, the cDNAs encoding the 
fusion proteins are sequenced and may be cloned into expression or transcription vectors in vitro. 
The binding of the encoded polypeptides to the target regulatory sequences of the GENSET gene 
may be confirmed by techniques familiar to the one skilled in the art, such as gel retardation assays 
or DNase protection assays. 



Ligands interacting with GENSET polypeptides 

For the purpose of the present invention, a ligand means a molecule, such as a protein, a 
peptide, an antibody or any synthetic chemical compound capable of binding to a GENSET protein 
or one of its fragments or variants or of modulating the expression of the polynucleotide coding for 
GENSET or a fragment or variant thereof. 

In the ligand screening method according to the present invention, a biological sample or a 
defined molecule to be tested as a putative ligand of a GENSET protein is brought into contact with 
the corresponding purified GENSET protein, for example the corresponding purified recombinant 
GENSET protein produced by a recombinant cell host as described herein, in order to form a 
complex between this protein and the putative ligand molecule to be tested. 

As an illustrative example, to study the interaction of a GENSET protein, or a fragment 
comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more 
preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide selected from the 
group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides included in SEQ ID 
Nos: 242-272 and 274-384 as well as full-length and mature polypeptides encoded by the clone 
inserts of the deposited clone pool, with drugs or small molecules, such as molecules generated 
through combinatorial chemistry approaches, the microdialysis coupled to HPLC method described 
by Wang et al. (1997) or the affinity capillary electrophoresis method described by Bush et al. 
(1997), the disclosures of which are incorporated by reference, can be used. 

In further methods, peptides, drugs, fatty acids, lipoproteins, or small molecules which 
interact with a GENSET protein, or a fragment comprising a contiguous span of at least 6 amino 
acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 
100 amino acids of a polypeptide selected from the group consisting of sequences of SEQ ID Nos: 
242-482, mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, as well as full-length 
and mature polypeptides encoded by the clone inserts of the deposited clone pool may be identified 
using assays such as the following. The molecule to be tested for binding is labeled with a 
detectable label, such as a fluorescent, radioactive, or enzymatic tag and placed in contact with 
immobilized GENSET protein, or a fragment thereof, under conditions which permit specific 
binding to occur. After removal of non-specifically bound molecules, bound molecules are detected 
using appropriate means. 

Various candidate substances or molecules can be assayed for interaction with a GENSET 
polypeptide. These substances or molecules include, without being limited to, natural or synthetic 
organic compounds or molecules of biological origin such as polypeptides. When the candidate 
substance or molecule comprises a polypeptide, this polypeptide may be the resulting expression 
product of a phage clone belonging to a phage-based random peptide library, or alternatively the 
polypeptide may be the resulting expression product of a cDNA library cloned in a vector suitable 
for performing a two-hybrid screening assay. 

A. Candidate ligands obtained from random peptide libraries 
In a particular embodiment of the screening method, the putative ligand is the expression 
product of a DNA insert contained in a phage vector (Parmley and Smith, 1988). Specifically, 
random peptide phage libraries are used. The random DNA inserts encode peptides of 8 to 20 
amino acids in length (Oldenburg et al., 1992; Valadon et al., 1996; Lucas, 1994; Westerink, 1995; 
Felici et al., 1991), which disclosures are hereby incorporated by reference in their entireties. 
According to this particular embodiment, the recombinant phages expressing a protein that binds to 
an immobilized GENSET protein are retained and the complex formed between the GENSET protein 
and the recombinant phage may be subsequently immunoprecipitated by a polyclonal or a 
monoclonal antibody directed against the GENSET protein. 

Once the ligand library in recombinant phages has been constructed, the phage population is 
brought into contact with the immobilized GENSET protein. Then the preparation of complexes is 
washed in order to remove the non-specifically bound recombinant phages. The phages that bind 
specifically to the GENSET protein are then eluted by a buffer (acid pH) or immunoprecipitated by 
the monoclonal antibody produced by the hybridoma anti-GENSET, and this phage population is 
subsequently amplified by an over-infection of bacteria (for example E. coli). The selection step 
may be repeated several times, preferably 2-4 times, in order to select the most specific 
recombinant phage clones. The last step comprises characterizing the peptide produced by the 
selected recombinant phage clones either by expression in infected bacteria and isolation, by 
expressing the phage insert in another host-vector system, or by sequencing the insert contained in the 
selected recombinant phages. 
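
The effect of repeating the selection step can be illustrated with a very simple enrichment model, sketched below in Python. It assumes that, in each round, specifically binding phages and non-specifically binding phages are each recovered with a fixed probability, and that amplification in bacteria preserves their proportions. These assumptions, the function name `panning_fractions` and all numerical values are hypothetical and serve only to show why a few rounds may suffice.

```python
def panning_fractions(initial_fraction: float,
                      recovery_binder: float,
                      recovery_background: float,
                      rounds: int) -> list[float]:
    """Fraction of specific binders in the phage pool after each round.

    Hypothetical model: each round keeps binders with probability
    `recovery_binder` and all other phages with probability
    `recovery_background`; amplification is assumed not to change the ratio.
    """
    fractions = []
    fraction = initial_fraction
    for _ in range(rounds):
        kept_binders = fraction * recovery_binder
        kept_background = (1.0 - fraction) * recovery_background
        fraction = kept_binders / (kept_binders + kept_background)
        fractions.append(fraction)
    return fractions


# Illustrative values only: one binder per million phages at the start,
# 10% recovery of binders and 0.01% carry-over of background per round.
history = panning_fractions(1e-6, 0.1, 1e-4, rounds=3)
```

With these illustrative numbers each round enriches binders roughly a thousand-fold while they are rare, so that the pool is dominated by binders after the third round.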

B. Candidate ligands obtained by competition experiments. 

Alternatively, peptides, drugs or small molecules which bind to a GENSET protein or 
fragment thereof comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 
amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide 
selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides 
included in SEQ ID Nos: 242-272 and 274-384, as well as full-length and mature polypeptides 
encoded by the clone inserts of the deposited clone pool, may be identified in competition 
experiments. In such assays, the GENSET protein, or a fragment thereof, is immobilized to a 
surface, such as a plastic plate. Increasing amounts of the peptides, drugs or small molecules are 
placed in contact with the immobilized GENSET protein, or a fragment thereof, in the presence of a 
detectably labeled known GENSET protein ligand. For example, the GENSET ligand may be 
detectably labeled with a fluorescent, radioactive, or enzymatic tag. The ability of the test molecule 
to bind the GENSET protein, or a fragment thereof, is determined by measuring the amount of 
detectably labeled known ligand bound in the presence of the test molecule. A decrease in the 
amount of known ligand bound to the GENSET protein, or a fragment thereof, when the test 
molecule is present indicates that the test molecule is able to bind to the GENSET protein, or a 
fragment thereof. 
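
The read-out of such a competition experiment can be expressed as the percentage of labeled known ligand displaced by the test molecule. The Python sketch below shows one way this might be computed from raw signals; the function name `percent_displacement`, the optional background correction and the example signal values are assumptions introduced for illustration and are not part of the method as described.

```python
def percent_displacement(signal_with_test: float,
                         signal_without_test: float,
                         background: float = 0.0) -> float:
    """Percentage of labeled known ligand displaced by a test molecule.

    `signal_without_test` is the signal of bound labeled ligand in the
    absence of the test molecule, `signal_with_test` the signal in its
    presence, and `background` an optional signal measured without
    immobilized protein (hypothetical correction).
    """
    specific_reference = signal_without_test - background
    if specific_reference <= 0:
        raise ValueError("reference signal must exceed background")
    specific_test = signal_with_test - background
    return 100.0 * (1.0 - specific_test / specific_reference)


# Illustrative (invented) signals for increasing amounts of a test molecule.
reference_signal = 1000.0
background_signal = 20.0
dose_response = {
    amount: percent_displacement(signal, reference_signal, background_signal)
    for amount, signal in {0.1: 980.0, 1.0: 760.0, 10.0: 510.0, 100.0: 120.0}.items()
}
```

A displacement that rises with the amount of test molecule, as in the invented values above, is the pattern expected of a molecule that competes with the known ligand for binding.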

C. Candidate ligands obtained by affinity chromatography. 

Proteins or other molecules interacting with a GENSET protein, or a fragment thereof 
comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more 
preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide selected from the 
group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides included in SEQ ID 
Nos: 242-272 and 274-384, as well as full-length and mature polypeptides encoded by the clone 
inserts of the deposited clone pool, can also be found using affinity columns which contain the 
GENSET protein, or a fragment thereof. The GENSET protein, or a fragment thereof, may be 
attached to the column using conventional techniques including chemical coupling to a suitable 
column matrix such as agarose, Affi Gel®, or other matrices familiar to those of skill in the art. In 
some embodiments of this method, the affinity column contains chimeric proteins in which the 
GENSET protein, or a fragment thereof, is fused to glutathione S-transferase (GST). A mixture of 
cellular proteins or pool of expressed proteins as described above is applied to the affinity column. 
Proteins or other molecules interacting with the GENSET protein, or a fragment thereof, attached to 
the column can then be isolated and analyzed on 2-D electrophoresis gel as described in Ramunsen 
et al. (1997), the disclosure of which is incorporated by reference. Alternatively, the proteins 
retained on the affinity column can be purified by electrophoresis based methods and sequenced. 
The same method can be used to isolate antibodies, to screen phage display products, or to screen 
phage display human antibodies. 

D. Candidate ligands obtained by optical biosensor methods 

Proteins interacting with a GENSET protein, or a fragment comprising a contiguous span of 
at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 
25, 30, 40, 50, or 100 amino acids of a polypeptide selected from the group consisting of sequences 
of SEQ ID Nos: 242-482, mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, as 
well as full-length and mature polypeptides encoded by the clone inserts of the deposited clone 
pool, can also be screened by using an Optical Biosensor as described in Edwards and 
Leatherbarrow (1997) and also in Szabo et al. (1995), the disclosures of which are incorporated by 
reference. This technique permits the detection of interactions between molecules in real time, 
without the need of labeled molecules. This technique is based on the surface plasmon resonance 
(SPR) phenomenon. Briefly, the candidate ligand molecule to be tested is attached to a surface 
(such as a carboxymethyl dextran matrix). A light beam is directed towards the side of the surface 
that does not contain the sample to be tested and is reflected by said surface. The SPR phenomenon 
causes a decrease in the intensity of the reflected light at a specific combination of angle and 
wavelength. The binding of candidate ligand molecules causes a change in the refractive index on 
the surface, which change is detected as a change in the SPR signal. For screening of candidate 
ligand molecules or substances that are able to interact with the GENSET protein, or a fragment 
thereof, the GENSET protein, or a fragment thereof, is immobilized onto a surface. This surface 
comprises one side of a cell through which flows the candidate molecule to be assayed. The 
binding of the candidate molecule on the GENSET protein, or a fragment thereof, is detected as a 
change of the SPR signal. The candidate molecules tested may be proteins, peptides, carbohydrates, 
lipids, or small molecules generated by combinatorial chemistry. This technique may also be 
performed by immobilizing eukaryotic or prokaryotic cells or lipid vesicles exhibiting an 
endogenous or a recombinantly expressed GENSET protein at their surface. 

The main advantage of the method is that it allows the determination of the association rate 
between the GENSET protein and molecules interacting with the GENSET protein. It is thus 
possible to select specifically ligand molecules interacting with the GENSET protein, or a fragment 
thereof, through strong or conversely weak association constants. 
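
How rate and equilibrium constants relate to the measured signal can be illustrated with a simple 1:1 (Langmuir) interaction model, which is a common simplification and is not prescribed by the present description. In the Python sketch below, `ka` (association rate constant), `kd` (dissociation rate constant), `rmax` (maximal response) and all numerical values are hypothetical, and mass-transport effects are ignored.

```python
import math


def dissociation_constant(ka: float, kd: float) -> float:
    """Equilibrium dissociation constant KD = kd / ka (in M)."""
    return kd / ka


def equilibrium_response(conc: float, rmax: float, ka: float, kd: float) -> float:
    """Steady-state response for analyte concentration `conc` (1:1 model)."""
    return rmax * conc / (conc + dissociation_constant(ka, kd))


def association_response(t: float, conc: float, rmax: float,
                         ka: float, kd: float) -> float:
    """Response at time `t` (s) after the start of analyte injection.

    Solution of dR/dt = ka*conc*(rmax - R) - kd*R with R(0) = 0.
    """
    k_obs = ka * conc + kd  # observed rate constant of the association phase
    return equilibrium_response(conc, rmax, ka, kd) * (1.0 - math.exp(-k_obs * t))


# Illustrative values: ka = 1e5 1/(M*s), kd = 1e-3 1/s, hence KD = 10 nM.
KA, KD_RATE, RMAX = 1e5, 1e-3, 100.0
half_saturation = equilibrium_response(1e-8, RMAX, KA, KD_RATE)
```

Under this model a candidate injected at a concentration equal to KD gives half of the maximal response at equilibrium, and two candidates with the same affinity can still be distinguished by how fast they approach that plateau.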

E. Candidate ligands obtained through a two-hybrid screening assay. 

The yeast two-hybrid system is designed to study protein-protein interactions in vivo (Fields 
and Song, 1989), which disclosure is hereby incorporated by reference in its entirety, and relies 
upon the fusion of a bait protein to the DNA binding domain of the yeast Gal4 protein. This 
technique is also described in the US Patent N° US 5,667,973 and the US Patent N° 5,283,173, the 
technical teachings of both patents being herein incorporated by reference. 

The general procedure of library screening by the two-hybrid assay may be performed as 
described by Harper et al. (1993) or as described by Cho et al. (1998) or also Fromont-Racine et al. 
(1997), which disclosures are hereby incorporated by reference in their entireties. 

The bait protein or polypeptide comprises, consists essentially of, or consists of a GENSET 
polypeptide or a fragment thereof comprising a contiguous span of at least 6 amino acids, preferably 
at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids 
of a polypeptide selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature 
polypeptides included in SEQ ID Nos: 242-272 and 274-384, as well as full-length and mature 
polypeptides encoded by the clone inserts of the deposited clone pool. 

More precisely, the nucleotide sequence encoding the GENSET polypeptide or a fragment 
or variant thereof is fused to a polynucleotide encoding the DNA binding domain of the GAL4 
protein, the fused nucleotide sequence being inserted in a suitable expression vector, for example 
pAS2 or pM3. 

Then, a human cDNA library is constructed in a specially designed vector, such that the 
human cDNA insert is fused to a nucleotide sequence in the vector that encodes the transcriptional 
activation domain of the GAL4 protein. Preferably, the vector used is the pACT vector. The polypeptides 
encoded by the nucleotide inserts of the human cDNA library are termed "prey" polypeptides. 

A third vector contains a detectable marker gene, such as beta galactosidase gene or CAT 
gene that is placed under the control of a regulation sequence that is responsive to the binding of a 
complete Gal4 protein containing both the transcriptional activation domain and the DNA binding 
domain. For example, the vector pG5EC may be used. 

Two different yeast strains are also used. As an illustrative but non-limiting example the 
two different yeast strains may be the following: 

- Y190, the genotype of which is (MATa, Leu2-3, 112 ura3-12, trp1-901, his3-D200, 
ade2-101, gal4Dgal180D URA3 GAL-LacZ, LYS GAL-HIS3, cyhr); 

- Y187, the genotype of which is (MATα gal4 gal80 his3 trp1-901 ade2-101 ura3-52 leu2-3, 
-112 URA3 GAL-lacZmet), which is the opposite mating type of Y190. 

Briefly, 20 ng of pAS2/GENSET and 20 µg of pACT-cDNA library are co-transformed 
into yeast strain Y190. The transformants are selected for growth on minimal media lacking 
histidine, leucine and tryptophan, but containing the histidine synthesis inhibitor 3-AT (50 mM). 
Positive colonies are screened for beta galactosidase by filter lift assay. The double positive 
colonies (His+, beta-gal+) are then grown on plates lacking histidine, leucine, but containing 
tryptophan and cycloheximide (10 mg/ml) to select for loss of pAS2/GENSET plasmids but 
retention of pACT-cDNA library plasmids. The resulting Y190 strains are mated with Y187 strains 
expressing GENSET or non-related control proteins, such as cyclophilin B, lamin, or SNF1, as Gal4 
fusions as described by Harper et al. (1993) and by Bram et al. (1993), which disclosures are hereby 
incorporated by reference in their entireties, and screened for beta galactosidase by filter lift assay. 

Yeast clones that are beta-gal+ after mating with the control Gal4 fusions are considered false 
positives. 

In another embodiment of the two-hybrid method according to the invention, interaction 
between the GENSET or a fragment or variant thereof with cellular proteins may be assessed using 
the Matchmaker Two Hybrid System 2 (Catalog No. K 1604-1, Clontech). As described in the 
manual accompanying the kit, the disclosure of which is incorporated herein by reference, nucleic 
acids encoding the GENSET protein or a portion thereof, are inserted into an expression vector such 
that they are in frame with DNA encoding the DNA binding domain of the yeast transcriptional 
activator GAL4. A desired cDNA, preferably human cDNA, is inserted into a second expression 
vector such that it is in frame with DNA encoding the activation domain of GAL4. The two 
expression plasmids are transformed into yeast and the yeast are plated on selection medium which 
selects for expression of selectable markers on each of the expression vectors as well as GAL4 
dependent expression of the HIS3 gene. Transformants capable of growing on medium lacking 
histidine are screened for GAL4 dependent lacZ expression. Those cells which are positive in both the 
histidine selection and the lacZ assay contain an interaction between GENSET and the protein or peptide 
encoded by the initially selected cDNA insert. 

Compounds Modulating GENSET biological activity 
Another method of screening for compounds that modulate GENSET gene expression 
and/or biological activity is by measuring the effects of test compounds on a given cellular property 
in a host cell, such as apoptosis, proliferation, differentiation, protein glycosylation, etc., using a 
variety of techniques known to those skilled in the art including those described herein. 

In one embodiment, the present invention relates to a method of identifying an agent which 
alters GENSET activity, wherein a nucleic acid construct comprising a nucleic acid which encodes 
a mammalian GENSET polypeptide is introduced into a host cell. The host cells produced are 
maintained under conditions appropriate for expression of the encoded mammalian GENSET 
polypeptides, whereby the nucleic acid is expressed. The host cells are then contacted with a 
compound to be assessed (an agent) and the given cellular property of the cells is detected in the 
presence of the compound to be assessed. Detection of a change in the given cellular property in 
the presence of the agent indicates that the agent alters GENSET activity. 

In a particular embodiment, the invention relates to a method of identifying an agent which 
is an activator of GENSET activity, wherein detection of a change of the given cellular property in 
the presence of the agent indicates that the agent activates GENSET activity. In another particular 
embodiment, the invention relates to a method of identifying an agent which is an inhibitor of 
GENSET activity, wherein detection of a change of the given cellular property in the presence of 
the agent indicates that the agent inhibits GENSET activity. 

Methods of Screening for Compounds Modulating GENSET Expression and/or Activity 
The present invention also relates to methods of screening compounds for their ability to 
modulate (e.g. increase or inhibit) the activity or expression of GENSET. More specifically, the 
present invention relates to methods of testing compounds for their ability either to increase or to 
decrease expression or activity of GENSET. The assays are performed in vitro or in vivo. 

In vitro methods 

In vitro, cells expressing GENSET are incubated in the presence and absence of the test 
compound. By determining the level of GENSET expression in the presence of the test compound 
or the level of GENSET activity in the presence of the test compound, compounds can be identified 
that suppress or enhance GENSET expression or activity. Alternatively, constructs comprising a 
GENSET regulatory sequence operably linked to a reporter gene (e.g. luciferase, chloramphenicol 
acetyl transferase, LacZ, green fluorescent protein, etc.) can be introduced into host cells and the 
effect of the test compounds on expression of the reporter gene detected. Cells suitable for use in 
the foregoing assays include, but are not limited to, cells having the same origin as tissues or cell 
lines in which the polypeptide is known to be expressed using the data from Table DC. 

Consequently, the present invention encompasses a method for screening molecules that 
modulate the expression of a GENSET gene, said screening method comprising the steps of: 

a) cultivating a prokaryotic or a eukaryotic cell that has been transfected with a nucleotide 
sequence encoding a GENSET protein or a variant or a fragment thereof, placed under the control 
of its own promoter; 

b) bringing into contact said cultivated cell with a molecule to be tested; 

c) quantifying the expression of said GENSET protein or a variant or a fragment thereof in 
the presence of said molecule. 

Using DNA recombination techniques well known by one skilled in the art, the GENSET 
protein encoding DNA sequence is inserted into an expression vector, downstream from its 
promoter sequence. As an illustrative example, the promoter sequence of the GENSET gene is 
contained in the 5' untranscribed region of the GENSET genomic DNA. 

The quantification of the expression of a GENSET protein may be performed either at the 
mRNA level (using for example Northern blots, RT-PCR, preferably quantitative RT-PCR with 
primers and probes specific for the GENSET mRNA of interest) or at the protein level (using 
polyclonal or monoclonal antibodies in immunoassays such as ELISA or RIA assays, Western blots, 
or immunochemistry). 
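
When quantitative RT-PCR is used, relative mRNA levels are often derived from threshold cycle (Ct) values by the comparative Ct (2^-ΔΔCt) calculation. That calculation is not described in the present text; it is shown in the Python sketch below only as one possible way of turning raw measurements into an expression ratio, assuming approximately 100% amplification efficiency and the availability of a reference transcript whose level is unaffected by the tested molecule. The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target_treated: float,
                        ct_reference_treated: float,
                        ct_target_control: float,
                        ct_reference_control: float) -> float:
    """Fold change of a target mRNA in treated versus control cells.

    Comparative Ct calculation (assumed here, not taken from the text):
    each target Ct is first normalized to a reference transcript, then the
    treated sample is compared with the untreated control.
    """
    delta_treated = ct_target_treated - ct_reference_treated
    delta_control = ct_target_control - ct_reference_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Invented Ct values: the target crosses the threshold two cycles earlier
# in treated cells while the reference transcript is unchanged.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
```

With these invented values the target mRNA is about four times more abundant in the presence of the tested molecule.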

The present invention also concerns a method for screening substances or molecules that are 
able to increase, or in contrast to decrease, the level of expression of a GENSET gene. Such a 
method may allow one skilled in the art to select substances exerting a regulating effect on the 
expression level of a GENSET gene and which may be useful as active ingredients included in 
pharmaceutical compositions for treating patients suffering from disorders associated with abnormal 
levels of GENSET products. 

Thus, also part of the present invention is a method for screening a candidate molecule that 
modulates the expression of a GENSET gene, this method comprising the following steps: 

a) providing a recombinant cell host containing a nucleic acid, wherein said nucleic acid 
comprises a GENSET 5' regulatory region or a regulatory active fragment or variant thereof, 
operably linked to a polynucleotide encoding a detectable protein; 

b) obtaining a candidate molecule; and 

c) determining the ability of said candidate molecule to modulate the expression levels of 
said polynucleotide encoding the detectable protein. 
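
By way of illustration only, step c) can be sketched as a comparison of reporter signal measured with and without the candidate molecule. The function names, the use of a simple ratio of means, and the two-fold threshold are assumptions made for this sketch and are not part of the method described above.

```python
from statistics import mean


def fold_change(treated, control):
    """Ratio of mean reporter signal with the candidate to mean signal without it."""
    if not treated or not control:
        raise ValueError("both groups need at least one measurement")
    baseline = mean(control)
    if baseline <= 0:
        raise ValueError("control signal must be positive")
    return mean(treated) / baseline


def classify(treated, control, threshold=2.0):
    """Label a candidate using a hypothetical fold-change threshold (assumed value)."""
    fc = fold_change(treated, control)
    if fc >= threshold:
        return "enhancer"
    if fc <= 1.0 / threshold:
        return "inhibitor"
    return "no effect"
```

In practice the threshold would be chosen from replicate variability and appropriate statistical testing rather than fixed in advance.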

In a further embodiment, said nucleic acid comprising a GENSET 5' regulatory region or a
regulatory active fragment or variant thereof includes the 5'UTR region of a GENSET cDNA
selected from the group comprising the 5'UTRs of the sequences of SEQ ID Nos 1-241,
sequences of clone inserts of the deposited clone pool, regulatory active fragments and variants
thereof. In a more preferred embodiment of the above screening method, said nucleic acid includes
a promoter sequence which is endogenous with respect to the GENSET 5'UTR sequence. In
another more preferred embodiment of the above screening method, said nucleic acid includes a
promoter sequence which is exogenous with respect to the GENSET 5'UTR sequence defined
therein.

Preferred polynucleotides encoding a detectable protein are polynucleotides encoding
beta-galactosidase, green fluorescent protein (GFP) and chloramphenicol acetyl transferase (CAT).

The invention further relates to a method for the production of a pharmaceutical
composition comprising a method of screening a candidate molecule that modulates the expression
of a GENSET gene and furthermore mixing the identified molecule with a pharmaceutically
acceptable carrier.

The invention also pertains to kits for the screening of a candidate substance modulating the
expression of a GENSET gene. Preferably, such kits comprise a recombinant vector that allows the
expression of a GENSET 5' regulatory region or a regulatory active fragment or a variant thereof,
operably linked to a polynucleotide encoding a detectable protein or a GENSET protein or a
fragment or a variant thereof. More preferably, such kits include a recombinant vector that
comprises a nucleic acid including the 5'UTR region of a GENSET cDNA selected from the group
comprising the 5'UTRs of the sequences of SEQ ID Nos 1-241, sequences of clone inserts of the
deposited clone pool, regulatory active fragments and variants thereof, being operably linked to a
polynucleotide encoding a detectable protein.

For the design of suitable recombinant vectors useful for performing the screening methods
described above, reference is made to the section of the present specification wherein the preferred
recombinant vectors of the invention are detailed.

Another object of the present invention comprises methods and kits for the screening of
candidate substances that interact with a GENSET polypeptide, fragments or variants thereof. By
their capacity to bind covalently or non-covalently to a GENSET protein, fragments or variants
thereof, these substances or molecules may be advantageously used both in vitro and in vivo.

In vitro, said interacting molecules may be used as detection means in order to identify the
presence of a GENSET protein in a sample, preferably a biological sample.

The invention thus concerns a method for the screening of a candidate substance that interacts
with a GENSET polypeptide, fragments or variants thereof, said method comprising the following steps:

a) providing a polypeptide comprising, consisting essentially of, or consisting of a GENSET
protein or a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to
10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a
polypeptide selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides
included in SEQ ID Nos: 242-272 and 274-384 as well as full-length and mature polypeptides encoded by the
clone inserts of the deposited clone pool;

b) obtaining a candidate substance; 

c) bringing into contact said polypeptide with said candidate substance; 

d) detecting the complexes formed between said polypeptide and said candidate substance.

The invention further relates to a method for the production of a pharmaceutical
composition comprising a method for the screening of a candidate substance that interacts with a
GENSET polypeptide, fragments or variants thereof and furthermore mixing the identified
substance with a pharmaceutically acceptable carrier.

The invention further concerns a kit for the screening of a candidate substance interacting
with the GENSET polypeptide, wherein said kit comprises:

a) a polypeptide comprising, consisting essentially of, or consisting of a GENSET protein or
a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino
acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of a polypeptide
selected from the group consisting of sequences of SEQ ID Nos: 242-482, mature polypeptides included
in SEQ ID Nos: 242-272 and 274-384 as well as full-length and mature polypeptides encoded by the clone
inserts of the deposited clone pool; and

b) optionally means useful to detect the complex formed between said polypeptide or a 
variant thereof and the candidate substance. 

In a preferred embodiment of the kit described above, the detection means comprises a
monoclonal or polyclonal antibody binding to said GENSET protein or fragment or variant thereof.

In vivo methods 

Compounds that suppress or enhance GENSET expression can also be identified using in
vivo screens. In these assays, the test compound is administered (e.g. IV, IP, IM, orally, or
otherwise) to the animal, for example, at a variety of dose levels. The effect of the compound on
GENSET expression is determined by comparing GENSET levels, for example in tissues known to
express the gene of interest using, for example, the data obtained in Table IX, and using Northern
blots, immunoassays, PCR, etc., as described above. Suitable test animals include rodents (e.g.,
mice and rats), primates, and other mammals. Humanized mice can also be used as test animals, that is, mice
in which the endogenous mouse protein is ablated (knocked out) and the homologous human
protein added back by standard transgenic approaches. Such mice express only the human form of
a protein. Humanized mice expressing only the human GENSET can be used to study in vivo
responses to potential agents regulating GENSET protein or mRNA levels. As an example,
transgenic mice have been produced carrying the human apoE4 gene. They are then bred with a
mouse line that lacks endogenous apoE, to produce an animal model carrying human proteins
believed to be instrumental in development of Alzheimer's pathology. Such transgenic animals are
useful for dissecting the biochemical and physiological steps of disease, and for development of
therapies for disease intervention (Loring et al., 1996) (incorporated herein by reference in its
entirety).



Uses for compounds modulating GENSET expression and/or biological activity 

Using in vivo (or in vitro) systems, it may be possible to identify compounds that exert a
tissue specific effect, for example, that increase GENSET expression or activity only in tissues of
interest. Screening procedures such as those described above are also useful for identifying agents
for their potential use in pharmacological intervention strategies. Agents that enhance GENSET
expression or stimulate its activity may thus be used to treat disorders which require upregulated
levels of GENSET gene expression and/or activity. Compounds that suppress GENSET expression
or inhibit its activity can be used to treat disorders which require downregulated levels of GENSET
gene expression and/or activity.

Also encompassed by the present invention is an agent which interacts with GENSET
directly or indirectly, and inhibits or enhances GENSET expression and/or function. In one
embodiment, the agent is an inhibitor which interferes with GENSET directly (e.g., by binding
GENSET) or indirectly (e.g., by blocking the ability of GENSET to have a GENSET biological
activity). In a particular embodiment, an inhibitor of GENSET protein is an antibody specific for
GENSET protein or a functional portion of GENSET; that is, the antibody binds a GENSET
polypeptide. For example, the antibody can be specific for a polypeptide encoded by one of the
amino acid sequences of human GENSET genes (SEQ ID Nos: 242-482, mature polypeptides
included in SEQ ID Nos: 242-272 and 274-384, full-length and mature polypeptides encoded by the
clone inserts of the deposited clone pool), mammalian GENSET or portions thereof. Alternatively, the
inhibitor can be an agent other than an antibody (e.g., small organic molecule, protein or peptide)
which binds GENSET and blocks its activity. For example, the inhibitor can be an agent which
mimics GENSET structurally, but lacks its function. Alternatively, it can be an agent which binds
to or interacts with a molecule which GENSET normally binds with or interacts with, thus blocking
GENSET from doing so and preventing it from exerting the effects it would normally exert.

In another embodiment, the agent is an enhancer (activator) of GENSET which increases
the activity of GENSET (increases the effect of a given amount or level of GENSET), increases the
length of time it is effective (by preventing its degradation or otherwise prolonging the time during
which it is active), or both, either directly or indirectly.

The GENSET sequences of the present invention can also be used to generate nonhuman
gene knockout animals, such as mice, which lack a GENSET gene or transgenically overexpress
GENSET. For example, such GENSET gene knockout mice can be generated and used to obtain
further insight into the function of GENSET as well as assess the specificity of GENSET activators
and inhibitors. Also, overexpression of GENSET (e.g., human GENSET) in transgenic mice can
be used as a means of creating a test system for GENSET activators and inhibitors (e.g., against
human GENSET). In addition, the GENSET gene can be used to clone the GENSET
promoter/enhancer in order to identify regulators of GENSET transcription. GENSET gene
knockout animals include animals which completely or partially lack the GENSET gene and/or
GENSET activity or function. Thus the present invention relates to a method of inhibiting (partially
or completely) GENSET biological activity in a mammal (e.g., human) comprising administering to
the mammal an effective amount of an inhibitor of GENSET. The invention also relates to a
method of enhancing GENSET biological activity in a mammal comprising administering to the
mammal an effective amount of an enhancer of GENSET.

Inhibiting GENSET expression

Therapeutic compositions according to the present invention may advantageously comprise
one or several GENSET oligonucleotide fragments as an antisense tool or a triple helix tool that
inhibits the expression of the corresponding GENSET gene.

Antisense Approach 

In antisense approaches, nucleic acid sequences complementary to an mRNA are hybridized
to the mRNA intracellularly, thereby blocking the expression of the protein encoded by the mRNA.
The antisense nucleic acid molecules to be used in gene therapy may be either DNA or RNA
sequences. Preferred methods using antisense polynucleotides according to the present invention are
the procedures described by Sczakiel et al. (1995), which disclosure is hereby incorporated by
reference in its entirety.

Preferably, the antisense tools are chosen among the polynucleotides (15-200 bp long) that
are complementary to GENSET mRNA, more preferably to the 5' end of the GENSET mRNA. In
another embodiment, a combination of different antisense polynucleotides complementary to
different parts of the desired targeted gene is used.

Other preferred antisense polynucleotides according to the present invention are sequences
complementary to either a sequence of GENSET mRNAs comprising the translation initiation
codon ATG or a sequence of GENSET genomic DNA containing a splicing donor or acceptor site.
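
As a non-limiting sketch of how an antisense sequence spanning the translation initiation codon might be derived, the following takes a sense-strand sequence, locates an ATG, and returns the reverse complement of a window around it. The function names and the 9-nucleotide flank are assumptions for illustration. The sketch simply uses the first ATG found, which is not necessarily the true initiation codon of a given GENSET mRNA.

```python
# Complement table for DNA or RNA input; the output is written as DNA.
COMPLEMENT = str.maketrans("ACGUT", "TGCAA")


def reverse_complement(seq):
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    seq = seq.upper()
    unexpected = set(seq) - set("ACGUT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.translate(COMPLEMENT)[::-1]


def antisense_over_start_codon(sense, flank=9):
    """Antisense DNA oligo covering the first ATG plus `flank` nt on each side.

    `flank` is a hypothetical default; the window is clipped at the sequence ends.
    """
    s = sense.upper().replace("U", "T")
    pos = s.find("ATG")
    if pos < 0:
        raise ValueError("no ATG found in the sense sequence")
    start = max(0, pos - flank)
    end = min(len(s), pos + 3 + flank)
    return reverse_complement(s[start:end])
```

With the default flank, a sequence with at least 9 nucleotides on each side of the ATG yields a 21-mer, which falls within the 15-200 length range mentioned above.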

Preferably, the antisense polynucleotides of the invention have a 3' polyadenylation signal
that has been replaced with a self-cleaving ribozyme sequence, such that RNA polymerase II
transcripts are produced without poly(A) at their 3' ends, these antisense polynucleotides being
incapable of export from the nucleus, such as described by Liu et al. (1994), which disclosure is
hereby incorporated by reference in its entirety. In a preferred embodiment, these GENSET
antisense polynucleotides also comprise, within the ribozyme cassette, a histone stem-loop structure
to stabilize cleaved transcripts against 3'-5' exonucleolytic degradation, such as the structure
described by Eckner et al. (1991), which disclosure is hereby incorporated by reference in its
entirety.

The antisense nucleic acids should have a length and melting temperature sufficient to
permit formation of an intracellular duplex having sufficient stability to inhibit the expression of the
GENSET mRNA in the duplex. Strategies for designing antisense nucleic acids suitable for use in
gene therapy are disclosed in Green et al. (1986) and Izant and Weintraub (1984), the disclosures
of which are incorporated herein by reference.
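
For a rough first check of the melting temperature of a candidate oligonucleotide, two widely used rules of thumb can be applied: the Wallace rule, Tm = 2(A+T) + 4(G+C), for short oligonucleotides, and a GC-content formula for longer ones. Neither formula is taken from the present specification. Both are approximations for DNA duplexes in vitro, and the 14-nucleotide cutoff between them is a common convention assumed here. An intracellular RNA-containing duplex may behave differently.

```python
def melting_temperature(oligo):
    """Approximate Tm in degrees Celsius using common rule-of-thumb formulas."""
    s = oligo.upper().replace("U", "T")
    if not s or set(s) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T/U")
    gc = s.count("G") + s.count("C")
    at = len(s) - gc
    if len(s) < 14:
        # Wallace rule, usually quoted for short oligonucleotides
        return 2.0 * at + 4.0 * gc
    # GC-content approximation, usually quoted for longer oligonucleotides
    return 64.9 + 41.0 * (gc - 16.4) / len(s)
```

Such an estimate is only a screening aid; the stability actually required to inhibit expression would be determined experimentally.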

In some strategies, antisense molecules are obtained by reversing the orientation of the
GENSET coding region with respect to a promoter so as to transcribe the opposite strand from that
which is normally transcribed in the cell. The antisense molecules may be transcribed using in vitro
transcription systems such as those which employ T7 or SP6 polymerase to generate the transcript.
Another approach involves transcription of GENSET antisense nucleic acids in vivo by operably
linking DNA containing the antisense sequence to a promoter in a suitable expression vector.

Alternatively, oligonucleotides which are complementary to the strand normally transcribed
in the cell may be synthesized in vitro. Thus, the antisense nucleic acids are complementary to the
corresponding mRNA and are capable of hybridizing to the mRNA to create a duplex. In some
embodiments, the antisense sequences may contain modified sugar phosphate backbones to increase
stability and make them less sensitive to RNase activity. Examples of modifications suitable for use
in antisense strategies include 2' O-methyl RNA oligonucleotides and peptide nucleic acid (PNA)
oligonucleotides. Further examples are described by Rossi et al. (1991), which disclosure is hereby
incorporated by reference in its entirety.

Various types of antisense oligonucleotides complementary to the sequence of the GENSET
cDNA or genomic DNA may be used. In one preferred embodiment, stable and semi-stable
antisense oligonucleotides described in International Application No. PCT WO94/23026, hereby
incorporated by reference, are used. In these molecules, the 3' end or both the 3' and 5' ends are
engaged in intramolecular hydrogen bonding between complementary base pairs. These molecules
are better able to withstand exonuclease attacks and exhibit increased stability compared to
conventional antisense oligonucleotides.

In another preferred embodiment, the antisense oligodeoxynucleotides against herpes 
simplex virus types 1 and 2 described in International Application No. WO 95/04141, hereby 
incorporated by reference, are used. 

In yet another preferred embodiment, the covalently cross-linked antisense oligonucleotides
described in International Application No. WO 96/31523, hereby incorporated by reference, are
used. These double- or single-stranded oligonucleotides comprise one or more, respectively, inter-
or intra-oligonucleotide covalent cross-linkages, wherein the linkage consists of an amide bond
between a primary amine group of one strand and a carboxyl group of the other strand or of the
same strand, respectively, the primary amine group being directly substituted in the 2' position of
the strand nucleotide monosaccharide ring, and the carboxyl group being carried by an aliphatic
spacer group substituted on a nucleotide or nucleotide analog of the other strand or the same strand,
respectively.

The antisense oligodeoxynucleotides and oligonucleotides disclosed in International
Application No. WO 92/18522, incorporated by reference, may also be used. These molecules are
stable to degradation and contain at least one transcription control recognition sequence which binds
to control proteins and are effective as decoys therefor. These molecules may contain "hairpin"
structures, "dumbbell" structures, "modified dumbbell" structures, "cross-linked" decoy structures
and "loop" structures.

In another preferred embodiment, the cyclic double-stranded oligonucleotides described in
European Patent Application No. 0 572 287 A2, hereby incorporated by reference, are used. These
ligated oligonucleotide "dumbbells" contain the binding site for a transcription factor and inhibit
expression of the gene under control of the transcription factor by sequestering the factor.

Use of the closed antisense oligonucleotides disclosed in International Application No. WO
92/19732, hereby incorporated by reference, is also contemplated. Because these molecules have
no free ends, they are more resistant to degradation by exonucleases than are conventional
oligonucleotides. These oligonucleotides may be multifunctional, interacting with several regions
which are not adjacent to the target mRNA.

The appropriate level of antisense nucleic acids required to inhibit gene expression may be
determined using in vitro expression analysis. The antisense molecule may be introduced into the
cells by diffusion, injection, infection or transfection using procedures known in the art. For
example, the antisense nucleic acids can be introduced into the body as a bare or naked
oligonucleotide, oligonucleotide encapsulated in lipid, oligonucleotide sequence encapsidated by
viral protein, or as an oligonucleotide operably linked to a promoter contained in an expression
vector. The expression vector may be any of a variety of expression vectors known in the art,
including retroviral or viral vectors, vectors capable of extrachromosomal replication, or integrating
vectors. The vectors may be DNA or RNA.

The antisense molecules are introduced onto cell samples at a number of different
concentrations, preferably between 1x10^-10 M and 1x10^-4 M. Once the minimum concentration that can
adequately control gene expression is identified, the optimized dose is translated into a dosage
suitable for use in vivo. For example, an inhibiting concentration in culture of 1x10^-7 M translates into
a dose of approximately 0.6 mg/kg bodyweight. Levels of oligonucleotide approaching 100 mg/kg
bodyweight or higher may be possible after testing the toxicity of the oligonucleotide in laboratory
animals. It is additionally contemplated that cells from the vertebrate are removed, treated with the
antisense oligonucleotide, and reintroduced into the vertebrate.
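
The specification does not state how the culture concentration is converted into the quoted dose. One set of assumptions that reproduces the figure of approximately 0.6 mg/kg is an oligonucleotide molecular weight of about 6000 g/mol (roughly an 18- to 20-mer) distributed in about 1 liter per kilogram of bodyweight. Both values are assumptions made for this sketch only, as is the function name.

```python
def dose_mg_per_kg(conc_molar, mol_weight=6000.0, dist_volume_l_per_kg=1.0):
    """Convert a molar concentration to mg of oligonucleotide per kg bodyweight.

    mol_weight (g/mol) and dist_volume_l_per_kg (L/kg) are assumed defaults,
    not values taken from the specification.
    """
    if conc_molar <= 0 or mol_weight <= 0 or dist_volume_l_per_kg <= 0:
        raise ValueError("all inputs must be positive")
    grams_per_kg = conc_molar * mol_weight * dist_volume_l_per_kg
    return grams_per_kg * 1000.0  # g -> mg


if __name__ == "__main__":
    print(f"{dose_mg_per_kg(1e-7):.2f} mg/kg")
```

Different oligonucleotide lengths or distribution volumes would shift the result proportionally, so the conversion is only a starting point for in vivo dose finding.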

In a preferred application of this invention, the polypeptide encoded by the gene is first
identified, so that the effectiveness of antisense inhibition on translation can be monitored using
techniques that include but are not limited to antibody-mediated tests such as RIAs and ELISA,
functional assays, or radiolabeling.

An alternative to the antisense technology that is used according to the present invention
comprises using ribozymes that will bind to a target sequence via their complementary
polynucleotide tail and that will cleave the corresponding RNA by hydrolyzing its target site
(namely "hammerhead ribozymes"). Briefly, the simplified cycle of a hammerhead ribozyme
comprises (1) sequence specific binding to the target RNA via complementary antisense sequences;
(2) site-specific hydrolysis of the cleavable motif of the target strand; and (3) release of cleavage
products, which gives rise to another catalytic cycle. Indeed, the use of long-chain antisense
polynucleotides (at least 30 bases long) or ribozymes with long antisense arms is advantageous. A
preferred delivery system for antisense ribozymes is achieved by covalently linking these antisense
ribozymes to lipophilic groups or by using liposomes as a convenient vector. Preferred antisense
ribozymes according to the present invention are prepared as described by Rossi et al. (1991) and
Sczakiel et al. (1995), the specific preparation procedures referred to in said articles being
herein incorporated by reference.

Triple Helix Approach 

The GENSET genomic DNA may also be used to inhibit the expression of the GENSET 
gene based on intracellular triple helix formation. 

Triple helix oligonucleotides are used to inhibit transcription from a genome. They are
particularly useful for studying alterations in cell activity when it is associated with a particular
gene. The GENSET cDNAs or genomic DNAs of the present invention or, more preferably, a
fragment of those sequences, can be used to inhibit gene expression in individuals having diseases
associated with expression of a particular gene. Similarly, a portion of the GENSET genomic DNA
can be used to study the effect of inhibiting GENSET transcription within a cell. Traditionally,
homopurine sequences were considered the most useful for triple helix strategies. However,
homopyrimidine sequences can also inhibit gene expression. Such homopyrimidine
oligonucleotides bind to the major groove at homopurine:homopyrimidine sequences. Thus, both
types of sequences from the GENSET genomic DNA are contemplated within the scope of this
invention.

To carry out gene therapy strategies using the triple helix approach, the sequences of the
GENSET genomic DNA are first scanned to identify 10-mer to 20-mer homopyrimidine or
homopurine stretches which could be used in triple-helix based strategies for inhibiting GENSET
expression. Following identification of candidate homopyrimidine or homopurine stretches, their
efficiency in inhibiting GENSET expression is assessed by introducing varying amounts of
oligonucleotides containing the candidate sequences into tissue culture cells which express the
GENSET gene.

The oligonucleotides can be introduced into the cells using a variety of methods known to
those skilled in the art, including but not limited to calcium phosphate precipitation,
DEAE-dextran, electroporation, liposome-mediated transfection or native uptake.
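
The scanning step described above, in which homopurine or homopyrimidine stretches are located in a genomic sequence, can be sketched with a regular-expression search. The function name and the returned tuple layout are assumptions for illustration. The sketch reports maximal runs of at least 10 nucleotides on the strand supplied, so a run longer than 20 nucleotides would still need to be trimmed to a 10-mer to 20-mer window before use.

```python
import re


def find_triplex_candidates(seq, min_len=10):
    """Return (kind, start, run) for each maximal homopurine or homopyrimidine run.

    `start` is a 0-based index into `seq`; only runs of at least `min_len`
    nucleotides are reported, ordered by position.
    """
    s = seq.upper()
    hits = []
    for kind, bases in (("homopurine", "AG"), ("homopyrimidine", "CT")):
        pattern = f"[{bases}]{{{min_len},}}"  # e.g. [AG]{10,}
        for match in re.finditer(pattern, s):
            hits.append((kind, match.start(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    toy = "CCATGAGGAGAAGGAGATCGACTTCCTCTTCGAT"  # made-up sequence, not a GENSET sequence
    for kind, start, run in find_triplex_candidates(toy):
        print(kind, start, len(run), run)
```

Candidates found this way are only starting points; as stated above, their efficiency is assessed in tissue culture cells.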

Treated cells are monitored for altered cell function or reduced GENSET expression using
techniques such as Northern blotting, RNase protection assays, or PCR based strategies to monitor
the transcription levels of the GENSET gene in cells which have been treated with the
oligonucleotide. The cell functions to be monitored are predicted based upon the homologies of the
target gene corresponding to the cDNA from which the oligonucleotide was derived with known
gene sequences that have been associated with a particular function. The cell functions can also be
predicted based on the presence of abnormal physiology within cells derived from individuals with
a particular inherited disease, particularly when the cDNA is associated with the disease using
techniques described in the section entitled "Identification of genes associated with hereditary
diseases or drug response".

The oligonucleotides which are effective in inhibiting gene expression in tissue culture cells
may then be introduced in vivo using the techniques and at a dosage calculated based on the in vitro
results, as described in the section entitled "Antisense Approach".

In some embodiments, the natural (beta) anomers of the oligonucleotide units can be
replaced with alpha anomers to render the oligonucleotide more resistant to nucleases. Further, an
intercalating agent such as ethidium bromide, or the like, can be attached to the 3' end of the alpha
oligonucleotide to stabilize the triple helix. For information on the generation of oligonucleotides
suitable for triple helix formation see Griffin et al. (1989), which is hereby incorporated by this
reference.

Treating GENSET-related disorders 

The present invention further relates to methods of treating diseases/disorders by increasing
GENSET activity and/or expression. The invention also relates to methods of treating
diseases/disorders by decreasing GENSET activity and/or expression. These methodologies can be
effected using compounds selected using screening protocols such as those described herein and/or
by using the gene therapy and antisense approaches described in the art and herein. Gene therapy
can be used to effect targeted expression of GENSET. The GENSET coding sequence can be
cloned into an appropriate expression vector and targeted to a particular cell type(s) to achieve
efficient, high level expression. Introduction of the GENSET coding sequence into target cells can
be achieved, for example, using particle mediated DNA delivery (Haynes, 1996 and Maurer, 1999),
direct injection of naked DNA (Levy et al., 1996; and Felgner, 1996), or viral vector mediated
transport (Smith et al., 1996; Stone et al., 2000; Wu and Atai, 2000), each of which disclosures is
hereby incorporated by reference in its entirety. Tissue specific effects can be achieved, for
example, in the case of virus mediated transport by using viral vectors that are tissue specific, or by
the use of promoters that are tissue specific.

Combinatorial approaches can also be used to ensure that the GENSET coding sequence is
activated in the target tissue (Butt and Karathanasis, 1995; Miller and Whelan, 1997), which
disclosures are hereby incorporated by reference in their entireties. Antisense oligonucleotides
complementary to GENSET mRNA can be used to selectively diminish or ablate the expression of
the protein, for example, at sites of inflammation. More specifically, antisense constructs or
antisense oligonucleotides can be used to inhibit the production of GENSET in high expressing
cells such as those cited in the third column of Table X. Antisense mRNA can be produced by
transfecting into target cells an expression vector with the GENSET gene sequence, or portion
thereof, oriented in an antisense direction relative to the direction of transcription. Appropriate
vectors include viral vectors, including retroviral, adenoviral, and adeno-associated viral vectors, as
well as nonviral vectors. Tissue specific promoters can be used. Alternatively, antisense
oligonucleotides can be introduced directly into target cells to achieve the same goal. (See also
other delivery methodologies described herein in connection with gene therapy.) Oligonucleotides
can be selected/designed to achieve a high level of specificity (Wagner et al., 1996), which
disclosure is hereby incorporated by reference in its entirety. The therapeutic methodologies
described herein are applicable to both human and non-human mammals (including cats and dogs).

Pharmaceutical and physiologically acceptable compositions 

The present invention also relates to pharmaceutical or physiologically acceptable
compositions comprising, as active agent, the polypeptides, nucleic acids or antibodies of the
invention. The invention also relates to compositions comprising, as active agent, compounds
selected using the above-described screening protocols. Such compositions include the active agent
in combination with a pharmaceutically or physiologically acceptable carrier. In the case
of naked DNA, the "carrier" may be gold particles. The amount of active agent in the composition
can vary with the agent, the patient and the effect sought. Likewise, the dosing regimen can vary
depending on the composition and the disease/disorder to be treated.

Therefore, the invention relates to methods for the production of a pharmaceutical
composition comprising a method for selecting an active agent, compound, substance or molecule
using any of the screening methods described herein and furthermore mixing the identified active
agent, compound, substance or molecule with a pharmaceutically acceptable carrier.

The pharmaceutical compositions utilized in this invention may be administered by any
number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial,
intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal,
enteral, topical, sublingual, or rectal means. In addition to the active ingredients, these
pharmaceutical compositions may contain suitable pharmaceutically acceptable carriers comprising
excipients and auxiliaries which facilitate processing of the active compounds into preparations
which can be used pharmaceutically. Further details on techniques for formulation and
administration may be found in the latest edition of Remington's Pharmaceutical Sciences (Mack
Publishing Co., Easton, Pa.).

Pharmaceutical compositions for oral administration can be formulated using
pharmaceutically acceptable carriers well known in the art in dosages suitable for oral
administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets,
pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the like, for ingestion by the
patient.

Pharmaceutical preparations for oral use can be obtained by combining active
compounds with solid excipient, optionally grinding the resulting mixture, and processing the mixture
of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable
excipients are carbohydrate or protein fillers, such as sugars, including lactose, sucrose, mannitol, or
sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose, such as methyl cellulose,
hydroxypropylmethyl-cellulose, or sodium carboxymethylcellulose; gums including arabic and
tragacanth; and proteins such as gelatin and collagen. If desired, disintegrating or solubilizing
agents may be added, such as the cross-linked polyvinyl pyrrolidone, agar, alginic acid, or a salt
thereof, such as sodium alginate.

Dragee cores may be used in conjunction with suitable coatings, such as concentrated sugar
solutions, which may also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel,
polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or
solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for product
identification or to characterize the quantity of active compound, i.e., dosage.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a coating, such as glycerol or sorbitol.
Push-fit capsules can contain active ingredients mixed with a filler or binders, such as lactose or
starches, lubricants, such as talc or magnesium stearate, and, optionally, stabilizers. In soft
capsules, the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils,
liquid, or liquid polyethylene glycol with or without stabilizers.

30 Pharmaceutical formulations suitable for parenteral administration may be formulated in 

aqueous solutions, preferably in physiologically compatible buffers such as Hanks solution, 
Ringer's solution, or physiologically buffered saline. Aqueous injection suspensions may contain 
substances which increase the viscosity of the suspension, such as sodium carboxymethylcellulose, 
sorbitol, or dextran. Additionally, suspensions of the active compounds may be prepared as
appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils
such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes.
Optionally, the suspension may also contain suitable stabilizers or agents which increase the
solubility of the compounds to allow for the preparation of highly concentrated solutions. 

For topical or nasal administration, penetrants appropriate to the particular barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art. 

The pharmaceutical compositions of the present invention may be manufactured in a
manner that is known in the art, e.g., by means of conventional mixing, dissolving, granulating,
dragee-making, levigating, emulsifying, encapsulating, entrapping, or lyophilizing processes. 

The pharmaceutical composition may be provided as a salt and can be formed with many 
acids, including but not limited to hydrochloric, sulfuric, acetic, lactic, tartaric, malic, and succinic acids.
Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free
base forms. In other cases, the preferred preparation may be a lyophilized powder which may 
contain any or all of the following: 1-50 mM histidine, 0.1%-2% sucrose, and 2-7% mannitol, at a 
pH range of 4.5 to 5.5, that is combined with buffer prior to use. 

After pharmaceutical compositions have been prepared, they can be placed in an
appropriate container and labeled for treatment of an indicated condition. For administration of
GENSET, such labeling would include amount, frequency, and method of administration. 

Pharmaceutical compositions suitable for use in the invention include compositions wherein 
the active ingredients are contained in an effective amount to achieve the intended purpose. The 
determination of an effective dose is well within the capability of those skilled in the art. 

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models, usually mice, rabbits, dogs, or pigs.
The animal model may also be used to determine the appropriate concentration range and route of 
administration. Such information can then be used to determine useful doses and routes for 
administration in humans. 

A therapeutically effective dose refers to that amount of active ingredient, for example
GENSET or fragments thereof, antibodies of GENSET, agonists, antagonists or inhibitors of
GENSET, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be 
determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., 
ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to
50% of the population). The dose ratio between therapeutic and toxic effects is the therapeutic
index, and it can be expressed as the ratio, LD50/ED50. Pharmaceutical compositions which 
exhibit large therapeutic indices are preferred. The data obtained from cell culture assays and 
animal studies is used in formulating a range of dosage for human use. The dosage contained in 
such compositions is preferably within a range of circulating concentrations that include the ED50
with little or no toxicity. The dosage varies within this range depending upon the dosage form
employed, sensitivity of the patient, and the route of administration. 
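
By way of illustration only, the therapeutic index defined above as the ratio LD50/ED50 can be computed as in the following minimal sketch. The function name and the dose values are hypothetical and are chosen solely to show the arithmetic; they do not describe any GENSET composition.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50; both doses must use the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
print(therapeutic_index(ld50=500.0, ed50=25.0))  # prints 20.0
```

A larger value indicates a wider margin between the effective dose and the lethal dose, which is why compositions with large therapeutic indices are preferred.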

The exact dosage will be determined by the practitioner, in light of factors related to the
subject that requires treatment. Dosage and administration are adjusted to provide sufficient levels 
of the active moiety or to maintain the desired effect. Factors which may be taken into account 
include the severity of the disease state, general health of the subject, age, weight, and gender of the
subject, diet, time and frequency of administration, drug combination(s), reaction sensitivities, and
tolerance/response to therapy. Long-acting pharmaceutical compositions may be administered every
3 to 4 days, every week, or once every two weeks depending on half-life and clearance rate of the 
particular formulation. 

Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of
about 1 g, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art. 
Those skilled in the art will employ different formulations for nucleotides than for proteins or their 
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, 
conditions, locations, etc. 

Uses of GENSET sequences: Computer-Related Embodiments

As used herein the term "cDNA codes of SEQ ID Nos: 1-241" encompasses the nucleotide
sequences of SEQ ID Nos: 1-241 and of clone inserts of the deposited clone pool, fragments
thereof, nucleotide sequences homologous thereto, and sequences complementary to all of the 
preceding sequences. The fragments include fragments of SEQ ID Nos: 1-241 comprising at least
8, 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500, 1000 or 2000
consecutive nucleotides of SEQ ID Nos: 1-241. Preferably the fragments include signal sequences
and coding sequences for mature polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides 
described in Tables Va and Table Vb, polynucleotides encoding polypeptides described in Table VI, 
polynucleotides described herein as encoding polypeptides having a biological activity, or fragments
comprising at least 8, 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500,
1000 or 2000 consecutive nucleotides of the signal sequences or coding sequences for mature 
polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides described in Tables Va and Table 
Vb, polynucleotides encoding polypeptides described in Table VI, and polynucleotides described
herein as encoding polypeptides having a biological activity. Homologous sequences and fragments
of SEQ ID Nos: 1-241 refer to a sequence having at least 99%, 98%, 97%, 96%, 95%, 90%, 85%,
80%, or 75% identity to these sequences. Identity may be determined using any of the computer 
programs and parameters described herein, including BLAST2N with the default parameters or with 
any modified parameters. Homologous sequences also include RNA sequences in which uridines 
replace the thymines in the cDNA codes of SEQ ID Nos: 1-241. The homologous sequences may
be obtained using any of the procedures described herein or may result from the correction of a
sequencing error as described above. Preferably the homologous sequences and fragments of SEQ
ID Nos: 1-241 include polynucleotides homologous to signal sequences and coding sequences for
mature polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides described in Tables Va and 
Table Vb, polynucleotides encoding a polypeptide fragment described as a domain in Table VI, 
polynucleotides described herein as encoding polypeptides having a biological activity, or fragments
comprising at least 8, 10, 12, 15, 18, 20, 25, 28, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, 500,
1000 or 2000 consecutive nucleotides of the signal sequences and coding sequences for mature 
polypeptides of SEQ ID Nos: 1-31 and 33-143, polynucleotides described in Tables Va and Table 
Vb, polynucleotides described in Table VI, and polynucleotides described herein as encoding
polypeptides having a biological activity. It will be appreciated that the cDNA codes of SEQ ID
Nos: 1-241 can be represented in the traditional single character format (See the inside back cover
of Stryer, 1995) or in any other format which records the identity of the nucleotides in a sequence.
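
By way of illustration only, several of the sequence classes named in the preceding definition (RNA sequences in which uridines replace thymines, complementary sequences, and fragments of a given number of consecutive nucleotides) can be derived from a cDNA code held in single character format, as in the following minimal sketch. The function names and the ten-nucleotide toy sequence are hypothetical and do not correspond to any of SEQ ID Nos: 1-241.

```python
# Translation table for the complementary base of each nucleotide.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_rna(cdna: str) -> str:
    """Return the RNA form of a cDNA code (uridine replaces thymine)."""
    return cdna.upper().replace("T", "U")


def reverse_complement(cdna: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return cdna.upper().translate(_COMPLEMENT)[::-1]


def fragments(seq: str, length: int) -> list[str]:
    """Return every fragment of `length` consecutive characters of `seq`."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


toy = "ATGGCCTTAG"  # hypothetical sequence, for illustration only
print(to_rna(toy))              # AUGGCCUUAG
print(reverse_complement(toy))  # CTAAGGCCAT
print(fragments(toy, 8))        # ['ATGGCCTT', 'TGGCCTTA', 'GGCCTTAG']
```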

As used herein the term "polypeptide codes of SEQ ID Nos: 242-482" encompasses the
polypeptide sequences of SEQ ID Nos: 242-482, the signal peptides included in SEQ ID Nos:
242-272 and 274-384, the mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, the
full-length, signal peptides and mature polypeptide sequences encoded by the clone inserts of the
deposited clone pool, polypeptide sequences homologous thereto, or fragments of any of the 
preceding sequences. Homologous polypeptide sequences refer to a polypeptide sequence having at
least 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80%, or 75% identity to one of the polypeptide
sequences of SEQ ID Nos: 242-482, the signal peptides included in SEQ ID Nos: 242-272 and
274-384, the mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, the full-length, signal
peptides and mature polypeptide sequences encoded by the clone inserts of the deposited clone 
pool. Identity may be determined using any of the computer programs and parameters described 
herein, including FASTA with the default parameters or with any modified parameters. The 
homologous sequences may be obtained using any of the procedures described herein or may result
from the correction of a sequencing error as described above. The polypeptide fragments comprise
at least 5, 6, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60, 75, 100, 150, 200, 250, 300, 350, 400, 450 or 
500 consecutive amino acids of the polypeptides of SEQ ID Nos: 242-482. Preferably, the 
fragments include polypeptides encoded by the signal peptides included in SEQ ID Nos: 242-272 
and 274-384, mature polypeptides included in SEQ ID Nos: 242-272 and 274-384, polynucleotides
described in Tables Va and in Table Vb, domains described in Table VI, epitopes described in
Table VII, polypeptides described herein as having a biological activity, or fragments comprising at
least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300 or 400 consecutive amino acids of the 
signal peptides included in SEQ ID Nos: 242-272 and 274-384, mature polypeptides included in 
SEQ ID Nos: 242-272 and 274-384, the polypeptides encoded by the polynucleotides described in
Tables Va and in Table Vb, domains of Table VI, epitopes of Table VII or of polypeptides
described herein as having a biological activity. It will be appreciated that the polypeptide codes of
SEQ ID Nos: 242-482 can be represented in the traditional single character format or three letter
format (See the inside back cover of Stryer, 1995) or in any other format which relates the identity
of the polypeptides in a sequence. 

It will be appreciated by those skilled in the art that the nucleic acid codes of the invention 
and polypeptide codes of the invention can be stored, recorded, and manipulated on any medium
which can be read and accessed by a computer. As used herein, the words "recorded" and "stored"
refer to a process for storing information on a computer medium. A skilled artisan can readily 
adopt any of the presently known methods for recording information on a computer readable 
medium to generate manufactures comprising one or more of the nucleic acid codes of the 
invention, or one or more of the polypeptide codes of the invention. Another aspect of the present
invention is a computer readable medium having recorded thereon at least 2, 5, 10, 15, 20, 25, 30,
50, 75, 100, 150 or 200 nucleic acid codes of the invention. Another aspect of the present invention 
is a computer readable medium having recorded thereon at least 2, 5, 10, 15, 20, 25, 30, 50, 75, 100, 
150 or 200 polypeptide codes of the invention. 

Computer readable media include magnetically readable media, optically readable media,
electronically readable media and magnetic/optical media. For example, the computer readable
media may be a hard disk, a floppy disk, a magnetic tape, CD-ROM, Digital Versatile Disk (DVD),
Random Access Memory (RAM), or Read Only Memory (ROM) as well as other types of
media known to those skilled in the art.

Embodiments of the present invention include systems, particularly computer systems
which store and manipulate the sequence information described herein. One example of a computer
system 100 is illustrated in block diagram form in Figure 2. As used herein, "a computer system" 
refers to the hardware components, software components, and data storage components used to 
analyze the nucleotide sequences of the nucleic acid codes of the invention or the amino acid 
sequences of the polypeptide codes of the invention. In one embodiment, the computer system 100
is a Sun Enterprise 1000 server (Sun Microsystems, Palo Alto, CA). The computer system 100
preferably includes a processor 105 for processing, accessing and manipulating the sequence data. The
processor 105 can be any well-known type of central processing unit, such as the Pentium III from 
Intel Corporation, or similar processor from Sun, Motorola, Compaq or International Business 
Machines. 

Preferably, the computer system 100 is a general purpose system that comprises the
processor 105 and one or more internal data storage components 110 for storing data, and one or
more data retrieving devices for retrieving the data stored on the data storage components. A
skilled artisan can readily appreciate that any one of the currently available computer systems is
suitable.

In one particular embodiment, the computer system 100 includes a processor 105 connected
to a bus which is connected to a main memory 115 (preferably implemented as RAM) and one or
more internal data storage devices 110, such as a hard drive and/or other computer readable media
having data recorded thereon. In some embodiments, the computer system 100 further includes one
or more data retrieving devices 118 for reading the data stored on the internal data storage devices
110.

The data retrieving device 118 may represent, for example, a floppy disk drive, a compact
disk drive, a magnetic tape drive, etc. In some embodiments, the internal data storage device 110 is
a removable computer readable medium such as a floppy disk, a compact disk, a magnetic tape, etc.
containing control logic and/or data recorded thereon. The computer system 100 may
advantageously include or be programmed by appropriate software for reading the control logic
and/or the data from the data storage component once inserted in the data retrieving device.

The computer system 100 includes a display 120 which is used to display output to a
computer user. It should also be noted that the computer system 100 can be linked to other
computer systems 125a-c in a network or wide area network to provide centralized access to the 
computer system 100. 

Software for accessing and processing the nucleotide sequences of the nucleic acid codes of
the invention or the amino acid sequences of the polypeptide codes of the invention (such as search
tools, compare tools, and modeling tools, etc.) may reside in main memory 115 during execution.

In some embodiments, the computer system 100 may further comprise a sequence comparer 
for comparing the above-described nucleic acid codes of the invention or the polypeptide codes of 
the invention stored on a computer readable medium to reference nucleotide or polypeptide
sequences stored on a computer readable medium. A "sequence comparer" refers to one or more
programs which are implemented on the computer system 100 to compare a nucleotide or 
polypeptide sequence with other nucleotide or polypeptide sequences and/or compounds including 
but not limited to peptides, peptidomimetics, and chemicals stored within the data storage means. 
For example, the sequence comparer may compare the nucleotide sequences of nucleic acid codes
of the invention or the amino acid sequences of the polypeptide codes of the invention stored on a
computer readable medium to reference sequences stored on a computer readable medium to 
identify homologies, motifs implicated in biological function, or structural motifs. The various 
sequence comparer programs identified elsewhere in this patent specification are particularly 
contemplated for use in this aspect of the invention. 

Figure 3 is a flow diagram illustrating one embodiment of a process 200 for comparing a
new nucleotide or protein sequence with a database of sequences in order to determine the
homology levels between the new sequence and the sequences in the database. The database of
sequences can be a private database stored within the computer system 100, or a public database
such as GENBANK, PIR or SWISSPROT that is available through the Internet.

The process 200 begins at a start state 201 and then moves to a state 202 wherein the new
sequence to be compared is stored to a memory in a computer system 100. As discussed above, the
memory could be any type of memory, including RAM or an internal storage device.

The process 200 then moves to a state 204 wherein a database of sequences is opened for
analysis and comparison. The process 200 then moves to a state 206 wherein the first sequence 
stored in the database is read into a memory on the computer. A comparison is then performed at a
state 210 to determine if the first sequence is the same as the new sequence. It is important to
note that this step is not limited to performing an exact comparison between the new sequence and
the first sequence in the database. Methods are well known to those of skill in the art for
comparing two nucleotide or protein sequences, even if they are not identical. For example, gaps
can be introduced into one sequence in order to raise the homology level between the two tested
sequences. The parameters that control whether gaps or other features are introduced into a
sequence during comparison are normally entered by the user of the computer system.

Once a comparison of the two sequences has been performed at the state 210, a
determination is made at a decision state 212 whether the two sequences are the same. Of course,
the term "same" is not limited to sequences that are absolutely identical. Sequences that are within 
the homology parameters entered by the user will be marked as "same" in the process 200. 

If a determination is made that the two sequences are the same, the process 200 moves to a
state 214 wherein the name of the sequence from the database is displayed to the user. This state
notifies the user that the sequence with the displayed name fulfills the homology constraints that 
were entered. Once the name of the stored sequence is displayed to the user, the process 200 moves 
to a decision state 218 wherein a determination is made whether more sequences exist in the
database. If no more sequences exist in the database, then the process 200 terminates at an end state
220. However, if more sequences do exist in the database, then the process 200 moves to a state 
224 wherein a pointer is moved to the next sequence in the database so that it can be compared to 
the new sequence. In this manner, the new sequence is aligned and compared with every sequence 
in the database. 

It should be noted that if a determination had been made at the decision state 212 that the
sequences were not homologous, then the process 200 would move immediately to the decision
state 218 in order to determine if any other sequences were available in the database for
comparison. 
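
By way of illustration only, the loop of process 200 can be sketched as follows. The sketch assumes an in-memory database of (name, sequence) pairs and uses `difflib.SequenceMatcher` from the Python standard library as a simple stand-in for the sequence comparison programs contemplated herein (such as BLAST2N or FASTA); the function names, the threshold value, and the toy sequences are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in percent (not a biological alignment)."""
    return 100.0 * SequenceMatcher(None, a, b, autojunk=False).ratio()


def search_database(
    new_seq: str,
    database: Iterable[tuple[str, str]],
    threshold: float,
    compare: Callable[[str, str], float] = similarity,
) -> list[str]:
    """Return names of stored sequences that meet the user's threshold."""
    hits = []
    for name, stored in database:          # states 206 and 224: read next sequence
        level = compare(new_seq, stored)   # state 210: perform the comparison
        if level >= threshold:             # decision state 212: "same" per user parameters
            hits.append(name)              # state 214: report the name
    return hits                            # end state 220: no more sequences


# Hypothetical database, for illustration only.
db = [("seqA", "ATGGCCTTAG"), ("seqB", "ATGGCATTAG"), ("seqC", "TTTTTTTTTT")]
print(search_database("ATGGCCTTAG", db, threshold=85.0))  # ['seqA', 'seqB']
```

Passing the comparison function as a parameter reflects the statement above that the comparison step is not limited to an exact match; any comparer returning a homology level could be substituted.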

Accordingly, one aspect of the present invention is a computer system comprising a
processor and a data storage device having stored thereon a nucleic acid code of the invention or a
polypeptide code of the invention. In some embodiments the computer system further comprises a
data storage device having retrievably stored thereon reference nucleotide sequences or polypeptide 
sequences to be compared to the nucleic acid code of the invention or polypeptide code of the 
invention and a sequence comparer for conducting the comparison. For example, the sequence
comparer may comprise a computer program which indicates polymorphisms. In other aspects of
the computer system, the system further comprises an identifier which identifies features in said
sequence. The sequence comparer may indicate a homology level between the sequences compared
or identify motifs implicated in biological function and structural motifs in the nucleic acid code of
the invention and polypeptide codes of the invention or it may identify structural motifs in 
sequences which are compared to these nucleic acid codes and polypeptide codes. In some 
embodiments, the data storage device may have stored thereon the sequences of at least 2, 5, 10, 15,
20, 25, 30, 50, 75, 100, 150 or 200 of the nucleic acid codes of the invention or polypeptide codes
of the invention. 

Another aspect of the present invention is a method for determining the level of homology 
between a nucleic acid code of the invention and a reference nucleotide sequence, comprising the 
steps of reading the nucleic acid code and the reference nucleotide sequence through the use of a
computer program which determines homology levels and determining homology between the
nucleic acid code and the reference nucleotide sequence with the computer program. The computer
program may be any of a number of computer programs for determining homology levels, including 
those specifically enumerated herein, including BLAST2N with the default parameters or with any 
modified parameters. The method may be implemented using the computer systems described
above. The method may also be performed by reading 2, 5, 10, 15, 20, 25, 30, 50, 75, 100, 150 or
200 of the above described nucleic acid codes of the invention through the use of the computer 
program and determining homology between the nucleic acid codes and reference nucleotide 
sequences. 

Figure 4 is a flow diagram illustrating one embodiment of a process 250 in a computer for
determining whether two sequences are homologous. The process 250 begins at a start state 252 and
then moves to a state 254 wherein a first sequence to be compared is stored to a memory. The 
second sequence to be compared is then stored to a memory at a state 256. The process 250 then 
moves to a state 260 wherein the first character in the first sequence is read and then to a state 262 
wherein the first character of the second sequence is read. It should be understood that if the
sequence is a nucleotide sequence, then the character would normally be either A, T, C, G or U. If
the sequence is a protein sequence, then it should be in the single letter amino acid code so that the
first and second sequences can be easily compared.

A determination is then made at a decision state 264 whether the two characters are the 
same. If they are the same, then the process 250 moves to a state 268 wherein the next characters in
the first and second sequences are read. A determination is then made whether the next characters
are the same. If they are, then the process 250 continues this loop until two characters are not the
same. If a determination is made that the next two characters are not the same, the process 250
moves to a decision state 274 to determine whether there are any more characters in either sequence to
read.

If there are not any more characters to read, then the process 250 moves to a state 276
wherein the level of homology between the first and second sequences is displayed to the user. The
level of homology is determined by calculating the proportion of characters between the sequences
that were the same out of the total number of characters in the first sequence. Thus, if every
character in a first 100 nucleotide sequence aligned with every character in a second sequence, the
homology level would be 100%.
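
By way of illustration only, the character-by-character comparison of process 250 can be sketched as follows. The sketch assumes an ungapped, position-by-position comparison and treats positions of the first sequence that have no counterpart in the second sequence as non-matching; the function name and the toy sequences are hypothetical.

```python
def homology_level(first: str, second: str) -> float:
    """Percent of characters of `first` matched at the same position in `second`."""
    if not first:
        raise ValueError("first sequence must not be empty")
    first, second = first.upper(), second.upper()
    matches = 0
    position = 0
    # Decision state 274: continue while both sequences have characters left.
    while position < len(first) and position < len(second):
        if first[position] == second[position]:  # decision state 264
            matches += 1
        position += 1                            # state 268: read next characters
    # State 276: proportion of matching characters out of the length of the first sequence.
    return 100.0 * matches / len(first)


print(homology_level("ATGGCCTTAG", "ATGGCCTTAG"))  # 100.0
print(homology_level("ATGGCCTTAG", "ATGGCATTAG"))  # 90.0
```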

Alternatively, the computer program may be a computer program which compares the
nucleotide sequences of the nucleic acid codes of the present invention to reference nucleotide
sequences in order to determine whether the nucleic acid code of the invention differs from a 
reference nucleic acid sequence at one or more positions. Optionally such a program records the 
length and identity of inserted, deleted or substituted nucleotides with respect to the sequence of 
either the reference polynucleotide or the nucleic acid code of the invention. In one embodiment,
the computer program may be a program which determines whether the nucleotide sequences of the
nucleic acid codes of the invention contain one or more single nucleotide polymorphisms (SNP) 
with respect to a reference nucleotide sequence. These single nucleotide polymorphisms may each 
comprise a single base substitution, insertion, or deletion. 
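
By way of illustration only, a program that records the position and identity of inserted, deleted or substituted nucleotides relative to a reference sequence can be sketched as follows. The sketch uses `difflib.SequenceMatcher` from the Python standard library as a generic difference finder rather than a dedicated polymorphism caller; the function name and the toy sequences are hypothetical.

```python
from difflib import SequenceMatcher


def list_differences(reference: str, query: str) -> list[tuple[str, int, str, str]]:
    """Return (kind, reference position, reference bases, query bases) per difference.

    `kind` is one of 'replace', 'delete' or 'insert' as reported by difflib;
    positions are zero-based offsets into the reference sequence.
    """
    matcher = SequenceMatcher(None, reference, query, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, i1, reference[i1:i2], query[j1:j2]))
    return differences


# Hypothetical sequences differing by a single base substitution.
print(list_differences("ATGGCCTTAG", "ATGGCATTAG"))  # [('replace', 5, 'C', 'A')]
```

A difference whose reference and query segments are each one base long corresponds to a single base substitution; an empty query or reference segment corresponds to a deletion or an insertion, respectively.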

Another embodiment of the present invention is a method for comparing a first sequence to
a reference sequence wherein the first sequence is selected from the group consisting of a cDNA
code of SEQ ID NOs. 1-297 and a polypeptide code of SEQ ID NOs. 298-594 comprising the steps
of reading the first sequence and the reference sequence through use of a computer program which 
compares sequences and determining differences between the first sequence and the reference 
sequence with the computer program. In some aspects of this embodiment, said step of determining
differences between the first sequence and the reference sequence comprises identifying
polymorphisms. 

Another aspect of the present invention is a method for determining the level of homology 
between a polypeptide code of the invention and a reference polypeptide sequence, comprising the 
steps of reading the polypeptide code of the invention and the reference polypeptide sequence
through use of a computer program which determines homology levels and determining homology
between the polypeptide code and the reference polypeptide sequence using the computer program. 

Accordingly, another aspect of the present invention is a method for determining whether a 
nucleic acid code of the invention differs at one or more nucleotides from a reference nucleotide 
sequence comprising the steps of reading the nucleic acid code and the reference nucleotide
sequence through use of a computer program which identifies differences between nucleic acid
sequences and identifying differences between the nucleic acid code and the reference nucleotide 
sequence with the computer program. In some embodiments, the computer program is a program 
which identifies single nucleotide polymorphisms. The method may be implemented by the
computer systems described above and the method illustrated in Figure 4. The method may also be
performed by reading at least 2, 5, 10, 15, 20, 25, 30, 50, 75, 100, 150 or 200 of the nucleic acid
codes of the invention and the reference nucleotide sequences through the use of the computer
program and identifying differences between the nucleic acid codes and the reference nucleotide
sequences with the computer program. 

Thus, another embodiment of the present invention is a method for comparing a first 
sequence to a reference sequence wherein the first sequence is selected from the group consisting of
the nucleic acid codes of the present invention or the polypeptide codes of the present invention
comprising the steps of reading the first sequence and the reference sequence through use of a 
computer program which compares sequences and determining differences between the first 
sequence and the reference sequence with the computer program. In some aspects of this 
embodiment, said step of determining differences between the first sequence and the reference
sequence comprises identifying polymorphisms.

Another aspect of the present invention is a method for determining the level of identity 
between a first sequence and a reference sequence, wherein the first sequence is selected from the 
group consisting of the nucleic acid codes of the present invention or the polypeptide codes of the 
present invention, comprising the steps of reading the first sequence and the reference sequence
through the use of a computer program which determines identity levels and determining identity
between the first sequence and the reference sequence with the computer program. 

In other embodiments the computer based system may further comprise an identifier for 
identifying features within the nucleotide sequences of the nucleic acid codes of the invention or the 
amino acid sequences of the polypeptide codes of the invention. An "identifier" refers to one or
more programs which identify certain features within the above-described nucleotide sequences of
the nucleic acid codes of the invention or the amino acid sequences of the polypeptide codes of the
invention. In one embodiment, the identifier may comprise a program which identifies an open
reading frame in the cDNA codes of the invention.

Another embodiment of the present invention is a method for identifying a feature in a
sequence selected from the group consisting of the nucleic acid codes of the invention or the amino
acid sequences of the polypeptide codes of the invention comprising the steps of reading the 
sequence through the use of a computer program which identifies features in sequences and 
identifying features in the sequence with said computer program. In one aspect of this embodiment, 
the computer program comprises a computer program which identifies open reading frames. In a
further embodiment, the computer program comprises a program that identifies linear or structural
motifs in a polypeptide sequence.
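
By way of illustration only, a program which identifies open reading frames can be sketched as follows under simplifying assumptions: only the forward strand is scanned, an open reading frame is taken to begin at an ATG codon and to end at the first in-frame TAA, TAG or TGA codon, and reading frames lacking a stop codon are ignored. The function name, the `min_codons` parameter, and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 1) -> list[tuple[int, int]]:
    """Return (start, end) offsets, end exclusive, of forward-strand ORFs.

    `min_codons` is the minimum number of codons before the stop codon,
    counting the initiating ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                       # first ATG opens the frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # include the stop codon
                start = None                    # look for the next ATG
    return sorted(orfs)


toy = "CCATGGCTTAAGG"  # hypothetical sequence, for illustration only
print(find_orfs(toy))  # [(2, 11)]
print(toy[2:11])       # ATGGCTTAA
```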

Figure 5 is a flow diagram illustrating one embodiment of an identifier process 300 for 
detecting the presence of a feature in a sequence. The process 300 begins at a start state 302 and 
then moves to a state 304 wherein a first sequence that is to be checked for features is stored to a
memory 115 in the computer system 100. The process 300 then moves to a state 306 wherein a
database of sequence features is opened. Such a database would include a list of each feature's
attributes along with the name of the feature. For example, a feature name could be "Initiation
Codon" and the attribute would be "ATG". Another example would be the feature name
"TAATAA Box" and the feature attribute would be "TAATAA". An example of such a database is
produced by the University of Wisconsin Genetics Computer Group (www.gcg.com). 

Once the database of features is opened at the state 306, the process 300 moves to a state
308 wherein the first feature is read from the database. A comparison of the attribute of the first
feature with the first sequence is then made at a state 310. A determination is then made at a 
decision state 316 whether the attribute of the feature was found in the first sequence. If the 
attribute was found, then the process 300 moves to a state 318 wherein the name of the found 
feature is displayed to the user. 

The process 300 then moves to a decision state 320 wherein a determination is made
whether more features exist in the database. If no more features do exist, then the process 300
terminates at an end state 324. However, if more features do exist in the database, then the process 
300 reads the next sequence feature at a state 326 and loops back to the state 310 wherein the 
attribute of the next feature is compared against the first sequence. 

15 It should be noted, that if the feature attribute is not found in the first sequence at the 

decision state 316, the process 300 moves directly to the decision state 320 in order to determine if 
any more features exist in the database. 
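
The loop of process 300 can be sketched in a few lines of Python. This is only an illustrative sketch and not the implementation of the disclosure: the dictionary `FEATURE_DB` is a hypothetical in-memory stand-in for the database of sequence features, the function name `identify_features` is an assumption, and a feature "attribute" is treated as a literal subsequence, as in the "ATG" and "TAATAA" examples above.

```python
from typing import Dict, List

# Hypothetical stand-in for the feature database opened at state 306:
# feature name -> attribute (here, a literal subsequence to search for).
FEATURE_DB: Dict[str, str] = {
    "Initiation Codon": "ATG",
    "TAATAA Box": "TAATAA",
}


def identify_features(sequence: str, feature_db: Dict[str, str]) -> List[str]:
    """Return the names of all features whose attribute occurs in the sequence."""
    sequence = sequence.upper()                 # state 304: sequence held in memory
    found: List[str] = []
    for name, attribute in feature_db.items():  # states 308/326: read each feature
        if attribute in sequence:               # states 310/316: compare and decide
            found.append(name)                  # state 318: report the feature name
    return found                                # state 324: no more features


if __name__ == "__main__":
    for feature_name in identify_features("ccATGggtTAATAAc", FEATURE_DB):
        print(feature_name)
```

A feature whose attribute is absent is simply skipped, which corresponds to moving directly from decision state 316 to decision state 320.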

In another embodiment, the identifier may comprise a molecular modeling program which
determines the 3-dimensional structure of the polypeptide codes of the invention. Such programs
may use any methods known to those skilled in the art including methods based on homology
modeling, fold recognition and ab initio methods as described in Sternberg et al., 1999, which
disclosure is hereby incorporated by reference in its entirety. In some embodiments, the molecular
modeling program identifies target sequences that are most compatible with profiles representing
the structural environments of the residues in known three-dimensional protein structures. (See,
e.g., Eisenberg et al., U.S. Patent No. 5,436,850 issued July 25, 1995, which disclosure is hereby
incorporated by reference in its entirety). In another technique, the known three-dimensional
structures of proteins in a given family are superimposed to define the structurally conserved
regions in that family. This protein modeling technique also uses the known three-dimensional
structure of a homologous protein to approximate the structure of the polypeptide codes of the
invention. (See e.g., Srinivasan, et al., U.S. Patent No. 5,557,535 issued September 17, 1996, which
disclosure is hereby incorporated by reference in its entirety). Conventional homology modeling
techniques have been used routinely to build models of proteases and antibodies. (Sowdhamini et
al., (1997)). Comparative approaches can also be used to develop three-dimensional protein models
when the protein of interest has poor sequence identity to template proteins. In some cases, proteins
fold into similar three-dimensional structures despite having very weak sequence identities. For
example, a number of helical cytokines fold into similar three-dimensional topologies in spite of
weak sequence homology.

The recent development of threading methods now enables the identification of likely
folding patterns in a number of situations where the structural relatedness between target and
template(s) is not detectable at the sequence level. Hybrid methods may also be used, in which fold
recognition is performed using Multiple Sequence Threading (MST), structural equivalencies are
deduced from the threading output using a distance geometry program DRAGON to construct a low
resolution model, and a full-atom representation is constructed using a molecular modeling package
such as QUANTA.

According to this 3-step approach, candidate templates are first identified by using the
novel fold recognition algorithm MST, which is capable of performing simultaneous threading of
multiple aligned sequences onto one or more 3-D structures. In a second step, the structural
equivalencies obtained from the MST output are converted into interresidue distance restraints and
fed into the distance geometry program DRAGON, together with auxiliary information obtained
from secondary structure predictions. The program combines the restraints in an unbiased manner
and rapidly generates a large number of low resolution model conformations. In a third step, these
low resolution model conformations are converted into full-atom models and subjected to energy
minimization using the molecular modeling package QUANTA. (See e.g., Aszodi et al., (1997)).

The results of the molecular modeling analysis may then be used in rational drug design
techniques to identify agents which modulate the activity of the polypeptide codes of the invention.

Accordingly, another aspect of the present invention is a method of identifying a feature
within the nucleic acid codes of the invention or the polypeptide codes of the invention comprising
reading the nucleic acid code(s) or the polypeptide code(s) through the use of a computer program
which identifies features therein and identifying features within the nucleic acid code(s) or
polypeptide code(s) with the computer program. In one embodiment, the computer program comprises
a computer program which identifies open reading frames. In a further embodiment, the computer
program identifies linear or structural motifs in a polypeptide sequence. In another embodiment,
the computer program comprises a molecular modeling program. The method may be performed by
reading a single sequence or at least 2, 5, 10, 15, 20, 25, 30, 50, 75, 100, 150 or 200 of the nucleic
acid codes of the invention or the polypeptide codes of the invention through the use of the
computer program and identifying features within the nucleic acid codes or polypeptide codes with
the computer program.
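
As an illustration of a program which identifies open reading frames, the following minimal Python sketch scans the three forward reading frames of a nucleic acid code for an ATG followed in frame by a stop codon. It is a simplified sketch under stated assumptions and not the program of the disclosure: the function name `find_orfs` and the `min_codons` threshold are hypothetical, only the given strand is scanned, and only the first ATG before each stop codon is reported.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, 0-based with end exclusive, for forward-strand ORFs.

    An ORF runs from an ATG to the first in-frame stop codon (stop included in
    the span). min_codons counts the codons before the stop, including the ATG.
    """
    seq = seq.upper()
    orfs: List[Tuple[int, int]] = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # close it and keep scanning
    return orfs


if __name__ == "__main__":
    print(find_orfs("ccATGAAAGGGTAAcc"))
```

Running the same function over each of several stored sequences corresponds to reading 2, 5, 10 or more codes through the program as described above.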

The nucleic acid codes of the invention or the polypeptide codes of the invention may be
stored and manipulated in a variety of data processor programs in a variety of formats. For
example, they may be stored as text in a word processing file, such as Microsoft WORD or
WORDPERFECT or as an ASCII file in a variety of database programs familiar to those of skill in
the art, such as DB2, SYBASE, or ORACLE. In addition, many computer programs and databases
may be used as sequence comparers, identifiers, or sources of reference nucleotide or polypeptide
sequences to be compared to the nucleic acid codes of the invention or the polypeptide codes of the
invention. The following list is intended not to limit the invention but to provide guidance to
programs and databases which are useful with the nucleic acid codes of the invention or the
polypeptide codes of the invention. The programs and databases which may be used include, but
are not limited to: MacPattern (EMBL), DiscoveryBase (Molecular Applications Group), GeneMine
(Molecular Applications Group), Look (Molecular Applications Group), MacLook (Molecular
Applications Group), BLAST and BLAST2 (NCBI), BLASTN and BLASTX (Altschul et al., 1990),
FASTA (Pearson and Lipman, 1988), FASTDB (Brutlag et al., 1990), Catalyst (Molecular
Simulations Inc.), Catalyst/SHAPE (Molecular Simulations Inc.), Cerius2.DBAccess (Molecular
Simulations Inc.), HypoGen (Molecular Simulations Inc.), Insight II (Molecular Simulations Inc.),
Discover (Molecular Simulations Inc.), CHARMm (Molecular Simulations Inc.), Felix (Molecular
Simulations Inc.), DelPhi (Molecular Simulations Inc.), QuanteMM (Molecular Simulations Inc.),
Homology (Molecular Simulations Inc.), Modeler (Molecular Simulations Inc.), ISIS (Molecular
Simulations Inc.), Quanta/Protein Design (Molecular Simulations Inc.), WebLab (Molecular
Simulations Inc.), WebLab Diversity Explorer (Molecular Simulations Inc.), Gene Explorer
(Molecular Simulations Inc.), SeqFold (Molecular Simulations Inc.), the EMBL/Swissprotein
database, the MDL Available Chemicals Directory database, the MDL Drug Data Report database,
the Comprehensive Medicinal Chemistry database, Derwent's World Drug Index database, the
BioByteMasterFile database, the Genbank database, and the Genseqn database. Many other
programs and databases would be apparent to one of skill in the art given the present disclosure.

Motifs which may be detected using the above programs include sequences encoding
leucine zippers, helix-turn-helix motifs, glycosylation sites, ubiquitination sites, alpha helices, and
beta sheets, signal sequences encoding signal peptides which direct the secretion of the encoded
proteins, sequences implicated in transcription regulation such as homeoboxes, acidic stretches,
enzymatic active sites, substrate binding sites, and enzymatic cleavage sites.
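
By way of illustration only, a linear motif of this kind can be located in a polypeptide code with a regular expression. The sketch below searches for the widely used N-glycosylation consensus N-{P}-[ST]-{P} (asparagine, any residue except proline, serine or threonine, any residue except proline). The choice of this particular consensus, the function name `find_n_glycosylation_sites` and the example sequence are assumptions made for the example; a match indicates only a candidate site, not a demonstrated modification.

```python
import re
from typing import List, Tuple

# N-{P}-[ST]-{P}; the lookahead lets overlapping candidate sites be reported.
N_GLYC = re.compile(r"(?=(N[^P][ST][^P]))")


def find_n_glycosylation_sites(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based position of the Asn, matched tetrapeptide) for each candidate site."""
    protein = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in N_GLYC.finditer(protein)]


if __name__ == "__main__":
    # Hypothetical polypeptide; the second Asn is followed by Pro and is skipped.
    for position, site in find_n_glycosylation_sites("MKNGSAANPTL"):
        print(position, site)
```

Other linear motifs can be handled the same way by swapping in a different pattern, whereas structural motifs such as alpha helices and beta sheets require the prediction programs listed above.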

Conclusion

As discussed above, the GENSET polynucleotides and polypeptides of the present
invention or fragments thereof can be used for various purposes. The polynucleotides can be used
to express recombinant protein for analysis, characterization or therapeutic use; as markers for
tissues in which the corresponding protein is preferentially expressed (either constitutively or at a
particular stage of tissue differentiation or development or in disease states); as molecular weight
markers on Southern gels; as chromosome markers or tags (when labeled) to identify chromosomes
or to map related gene positions; as a reagent (including a labeled reagent) in assays designed to
quantitatively determine levels of GENSET expression in biological samples; to compare with
endogenous DNA sequences in patients to identify potential genetic disorders; as probes to
hybridize and thus discover novel, related DNA sequences; as a source of information to derive
PCR primers for genetic fingerprinting; for selecting and making oligomers for attachment to a
"gene chip" or other support, including for examination for expression patterns; to raise anti-protein
antibodies using DNA immunization techniques; and as an antigen to raise anti-DNA antibodies or
elicit another immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand interaction), the
polynucleotide can also be used in interaction trap assays (such as, for example, that described in
Gyuris et al., (1993)) to identify polynucleotides encoding the other protein with which binding
occurs or to identify inhibitors of the binding interaction.

The proteins or polypeptides provided by the present invention can similarly be used in
assays to determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its receptor)
in biological fluids; as markers for tissues in which the corresponding protein is preferentially
expressed (either constitutively or at a particular stage of tissue differentiation or development or in
a disease state); and, of course, to isolate correlative receptors or ligands. Where the protein binds
or potentially binds to another protein (such as, for example, in a receptor-ligand interaction), the
protein can be used to identify the other protein with which binding occurs or to identify inhibitors
of the binding interaction. Proteins involved in these binding interactions can also be used to screen
for peptide or small molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or kit
format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning; A Laboratory
Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E.F. Fritsch and T. Maniatis
eds., 1989, and "Methods in Enzymology; Guide to Molecular Cloning Techniques", Academic
Press, Berger and Kimmel eds., 1987, which disclosures are hereby incorporated by reference in
their entireties.

Polynucleotides and proteins of the present invention can also be used as nutritional sources
or supplements. Such uses include without limitation use as a protein or amino acid supplement,
use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In such cases
the protein or polynucleotide of the invention can be added to the feed of a particular organism or
can be administered as a separate solid or liquid preparation, such as in the form of powder, pills,
solutions, suspensions or capsules. In the case of microorganisms, the protein or polynucleotide of
the invention can be added to the medium in or on which the microorganism is cultured.

Although this invention has been described in terms of certain preferred embodiments, other
embodiments which will be apparent to those of ordinary skill in the art in view of the disclosure
herein are also within the scope of this invention. Accordingly, the scope of the invention is
intended to be defined only by reference to the appended claims.


Examples 




Preparation of Antibody Compositions to GENSET proteins 

Substantially pure protein or polypeptide is isolated from transfected or transformed cells
containing an expression vector encoding a GENSET protein or a portion thereof. The
concentration of protein in the final preparation is adjusted, for example, by concentration on an
Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the
protein can then be prepared as follows:

A. Monoclonal Antibody Production by Hybridoma Fusion

Monoclonal antibody to epitopes in the GENSET protein or a portion thereof can be
prepared from murine hybridomas according to the classical method of Kohler and Milstein, (1975)
or derivative methods thereof. Also see Harlow and Lane, (1988).

Briefly, a mouse is repetitively inoculated with a few micrograms of the GENSET protein
or a portion thereof over a period of a few weeks. The mouse is then sacrificed, and the antibody
producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol
with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on
selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and
aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is
continued. Antibody-producing clones are identified by detection of antibody in the supernatant
fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall,
(1980), which disclosure is hereby incorporated by reference in its entirety, and derivative methods
thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested
for use. Detailed procedures for monoclonal antibody production are described in Davis, et al.
(1986) Section 21-2.

B. Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogeneous epitopes in the GENSET
protein or a portion thereof can be prepared by immunizing a suitable non-human animal with the
GENSET protein or a portion thereof, which can be unmodified or modified to enhance
immunogenicity. A suitable non-human animal, preferably a non-human mammal, is selected,
usually a mouse, rat, rabbit, goat, or horse. Alternatively, a crude preparation which has been
enriched for GENSET concentration can be used to generate antibodies. Such proteins, fragments
or preparations are introduced into the non-human mammal in the presence of an appropriate
adjuvant (e.g. aluminum hydroxide, RIBI, etc.) which is known in the art. In addition the protein,
fragment or preparation can be pretreated with an agent which will increase antigenicity; such
agents are known in the art and include, for example, methylated bovine serum albumin (mBSA),
bovine serum albumin (BSA), Hepatitis B surface antigen, and keyhole limpet hemocyanin (KLH).
Serum from the immunized animal is collected, treated and tested according to known procedures.
If the serum contains polyclonal antibodies to undesired epitopes, the polyclonal antibodies can be
purified by immunoaffinity chromatography.

Effective polyclonal antibody production is affected by many factors related both to the
antigen and the host species. Also, host animals vary in response to site of inoculations and dose,
with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng
level) of antigen administered at multiple intradermal sites appear to be most reliable. Techniques
for producing and processing polyclonal antisera are known in the art. An effective immunization
protocol for rabbits can be found in Vaitukaitis et al. (1971), which disclosure is hereby
incorporated by reference in its entirety.

Booster injections can be given at regular intervals, and antiserum harvested when antibody
titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar
against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony et al.,
(1973), which disclosure is hereby incorporated by reference in its entirety. Plateau concentration
of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM). Affinity of the
antisera for the antigen is determined by preparing competitive binding curves, as described, for
example, by Fisher (1980), which disclosure is hereby incorporated by reference in its entirety.

Antibody preparations prepared according to either the monoclonal or the polyclonal
protocol are useful in quantitative immunoassays which determine concentrations of
antigen-bearing substances in biological samples; they are also used semi-quantitatively or
qualitatively to identify the presence of antigen in a biological sample. The antibodies may also be
used in therapeutic compositions for killing cells expressing the protein or reducing the levels of
the protein in the body.

Biological assays 

Assaying GENSET Secreted Proteins to Determine Whether they Bind to the Cell Surface

The secreted proteins encoded by the GENSET cDNAs, preferably the proteins of SEQ ID
NOs: 242-272 and 274-384, or fragments thereof are cloned into expression vectors. The proteins
are purified by size, charge, immunochromatography or other techniques familiar to those skilled in
the art. Following purification, the proteins are labeled using techniques known to those skilled in
the art. The labeled proteins are incubated with cells or cell lines derived from a variety of organs
or tissues to allow the proteins to bind to any receptor present on the cell surface. Following the
incubation, the cells are washed to remove non-specifically bound protein. The labeled proteins are
detected by autoradiography. Alternatively, unlabeled proteins may be incubated with the cells and
detected with antibodies having a detectable label, such as a fluorescent molecule, attached thereto.

Specificity of cell surface binding may be analyzed by conducting a competition analysis in
which various amounts of unlabeled protein are incubated along with the labeled protein. The
amount of labeled protein bound to the cell surface decreases as the amount of competitive
unlabeled protein increases. As a control, various amounts of an unlabeled protein unrelated to the
labeled protein are included in some binding reactions. The amount of labeled protein bound to the
cell surface does not decrease in binding reactions containing increasing amounts of unrelated
unlabeled protein, indicating that the protein encoded by the cDNA binds specifically to the cell
surface.
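
The expected outcome of such a competition analysis can be illustrated with the standard single-site competitive binding relation, in which a competitor raises the apparent dissociation constant of the labeled protein by the factor (1 + [I]/Ki). The Python sketch below is illustrative only and is not part of the disclosed assay: it assumes one class of binding sites at equilibrium with free concentrations approximately equal to the amounts added, and the function name, the nM units and all numerical values are hypothetical.

```python
from typing import Optional


def labeled_fraction_bound(labeled_nM: float, kd_nM: float,
                           competitor_nM: float = 0.0,
                           ki_nM: Optional[float] = None) -> float:
    """Fraction of binding sites occupied by the labeled protein.

    ki_nM is the competitor's dissociation constant for the same site;
    None models an unrelated protein that does not bind the site at all.
    """
    if ki_nM is None:
        apparent_kd = kd_nM                       # unrelated control: no shift
    else:
        apparent_kd = kd_nM * (1.0 + competitor_nM / ki_nM)
    return labeled_nM / (labeled_nM + apparent_kd)


if __name__ == "__main__":
    # Hypothetical values: 1 nM labeled protein with Kd = 1 nM; the unlabeled
    # form of the same protein is assumed to compete with Ki equal to Kd.
    for competitor in (0.0, 1.0, 10.0, 100.0):
        specific = labeled_fraction_bound(1.0, 1.0, competitor, ki_nM=1.0)
        control = labeled_fraction_bound(1.0, 1.0, competitor, ki_nM=None)
        print(competitor, round(specific, 3), round(control, 3))
```

Under these assumptions the bound fraction falls steadily as unlabeled specific competitor is added, while the unrelated control leaves it unchanged, which is the pattern described above as indicating specific binding.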

As discussed herein, secreted proteins have been shown to have a number of important
physiological effects and, consequently, represent a valuable therapeutic resource. The secreted
proteins encoded by the cDNAs or fragments thereof made using any of the methods described
therein may be evaluated to determine their physiological activities as described below.

Assaying GENSET proteins or Fragments Thereof for Cytokine, Cell Proliferation or Cell
Differentiation Activity

Secreted proteins may act as cytokines or may affect cellular proliferation or differentiation.
Many protein factors discovered to date, including all known cytokines, have exhibited activity in
one or more factor dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of a protein of the present invention is evidenced by
any one of a number of routine factor dependent cell proliferation assays for cell lines including,
without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+ (preB M+), 2E8, RB5,
DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e and CMK. The proteins encoded by the cDNAs of
the invention or fragments thereof may be evaluated for their ability to regulate T cell or thymocyte
proliferation in assays such as those described above or in the following references, which are
incorporated herein by reference: Current Protocols in Immunology, Ed. by J.E. Coligan et al.,
Greene Publishing Associates and Wiley-Interscience; Takai et al., J. Immunol. 137:3494-3500,
1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology
133:327-341, 1991; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J.
Immunol. 152:1756-1761, 1994.

In addition, numerous assays for cytokine production and/or the proliferation of spleen
cells, lymph node cells and thymocytes are known. These include the techniques disclosed in
Current Protocols in Immunology, J.E. Coligan et al. Eds., Vol 1 pp. 3.12.1-3.12.14, John Wiley and
Sons, Toronto. 1994; and Schreiber, R.D., Current Protocols in Immunology, supra Vol 1 pp.
6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

The proteins encoded by the cDNAs of the invention may also be assayed for the ability to
regulate the proliferation and differentiation of hematopoietic or lymphopoietic cells. Many assays
for such activity are familiar to those skilled in the art, including the assays in the following
references, which are incorporated herein by reference: Bottomly, K., Davis, L.S. and Lipsky, P.E.,
Measurement of Human and Murine Interleukin 2 and Interleukin 4, Current Protocols in
Immunology, J.E. Coligan et al. Eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Nordan, R., Measurement of
Mouse and Human Interleukin 6, Current Protocols in Immunology, J.E. Coligan et al. Eds. Vol 1
pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. U.S.A.
83:1857-1861, 1986; Bennett, F., Giannotti, J., Clark, S.C. and Turner, K.J., Measurement of
Human Interleukin 11, Current Protocols in Immunology, J.E. Coligan et al. Eds. Vol 1 pp. 6.15.1,
John Wiley and Sons, Toronto. 1991; Ciarletta, A., Giannotti, J., Clark, S.C. and Turner, K.J.,
Measurement of Mouse and Human Interleukin 9, Current Protocols in Immunology, J.E. Coligan et
al. Eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

The proteins encoded by the cDNAs of the invention may also be assayed for their ability to
regulate T-cell responses to antigens. Many assays for such activity are familiar to those skilled in
the art, including the assays described in the following references, which are incorporated herein by
reference: Chapter 3 (In vitro Assays for Mouse Lymphocyte Function), Chapter 6 (Cytokines and
Their Cellular Receptors) and Chapter 7 (Immunologic Studies in Humans) in Current Protocols in
Immunology, J.E. Coligan et al. Eds., Greene Publishing Associates and Wiley-Interscience;
Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J.
Immun. 11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol.
140:508-512, 1988.

Those proteins which exhibit cytokine, cell proliferation, or cell differentiation activity may
then be formulated as pharmaceuticals and used to treat clinical conditions in which induction of
cell proliferation or differentiation is beneficial. Alternatively, as described in more detail below,
genes encoding these proteins or nucleic acids regulating the expression of these proteins may be
introduced into appropriate host cells to increase or decrease the expression of the proteins as
desired.

Assaying GENSET proteins or Fragments Thereof for Activity as Immune System Regulators 

The proteins encoded by the cDNAs of the invention may also be evaluated for their effects
as immune regulators. For example, the proteins may be evaluated for their activity to influence
thymocyte or splenocyte cytotoxicity. Numerous assays for such activity are familiar to those
skilled in the art including the assays described in the following references, which are incorporated
herein by reference: Chapter 3 (In vitro Assays for Mouse Lymphocyte Function 3.1-3.19) and
Chapter 7 (Immunologic Studies in Humans) in Current Protocols in Immunology, J.E. Coligan et
al. Eds., Greene Publishing Associates and Wiley-Interscience; Herrmann et al., Proc. Natl. Acad.
Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al., Cellular
Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

The proteins encoded by the cDNAs of the invention may also be evaluated for their effects
on T-cell dependent immunoglobulin responses and isotype switching. Numerous assays for such
activity are familiar to those skilled in the art, including the assays disclosed in the following
references, which are incorporated herein by reference: Maliszewski, J. Immunol. 144:3028-3033,
1990; Mond, J.J. and Brunswick, M., Assays for B Cell Function: In vitro Antibody Production, Vol
1 pp. 3.8.1-3.8.16 in Current Protocols in Immunology, J.E. Coligan et al. Eds., John Wiley and
Sons, Toronto. 1994.

The proteins encoded by the cDNAs of the invention may also be evaluated for their effect
on immune effector cells, including their effect on Th1 cells and cytotoxic lymphocytes. Numerous
assays for such activity are familiar to those skilled in the art, including the assays disclosed in the
following references, which are incorporated herein by reference: Chapter 3 (In vitro Assays for
Mouse Lymphocyte Function 3.1-3.19) and Chapter 7 (Immunologic Studies in Humans) in Current
Protocols in Immunology, J.E. Coligan et al. Eds., Greene Publishing Associates and
Wiley-Interscience; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol.
140:508-512, 1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

The proteins encoded by the cDNAs of the invention may also be evaluated for their effect
on dendritic cell mediated activation of naive T-cells. Numerous assays for such activity are
familiar to those skilled in the art, including the assays disclosed in the following references, which
are incorporated herein by reference: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al.,
Journal of Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260, 1995; Nair
et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994;
Macatonia et al., Journal of Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal
of Clinical Investigation 94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine
172:631-640, 1990.

The proteins encoded by the cDNAs of the invention may also be evaluated for their
influence on the lifetime of lymphocytes. Numerous assays for such activity are familiar to those
skilled in the art, including the assays disclosed in the following references, which are incorporated
herein by reference: Darzynkiewicz et al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia
7:659-670, 1993; Gorczyca et al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell
66:233-243, 1991; Zacharchuk, Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry
14:891-897, 1993; Gorczyca et al., International Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al., Proc.
Nat. Acad. Sci. USA 88:7548-7551, 1991.

Those proteins which exhibit activity as immune system regulators may then be
formulated as pharmaceuticals and used to treat clinical conditions in which regulation of immune
activity is beneficial. For example, the protein may be useful in the treatment of various immune
deficiencies and disorders (including severe combined immunodeficiency (SCID)), e.g., in
regulating (up or down) growth and proliferation of T and/or B lymphocytes, as well as effecting
the cytolytic activity of NK cells and other cell populations. These immune deficiencies may be
genetic or be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or may result from
autoimmune disorders. More specifically, infectious diseases caused by viral, bacterial, fungal or
other infection may be treatable using a protein of the present invention, including infections by
HIV, hepatitis viruses, herpesviruses, mycobacteria, Leishmania spp., malaria spp. and various
fungal infections such as candidiasis. Of course, in this regard, a protein of the present invention
may also be useful where a boost to the immune system generally may be desirable, i.e., in the
treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, autoimmune
thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host disease and
autoimmune inflammatory eye disease. Such a protein of the present invention may also be
useful in the treatment of allergic reactions and conditions, such as asthma (particularly allergic
asthma) or other respiratory problems. Other conditions, in which immune suppression is desired
(including, for example, organ transplantation), may also be treatable using a protein of the present
invention.

Using the proteins of the invention it may also be possible to regulate immune responses, in 
a number of ways. Down regulation may be in the form of inhibiting or blocking an immune 
response already in progress or may involve preventing the induction of an immune response. The 

30 functions of activated T-cells may be inhibited by suppressing T cell responses or by inducing 

specific tolerance in T cells, or both. Immunosuppression of T cell responses is generally an active, 
non-antigen-specific, process which requires continuous exposure of the T cells to the suppressive 
agent. Tolerance, which involves inducing non-responsiveness or anergy in T cells, is 
distinguishable from immunosuppression in that it is generally antigen-specific and persists after
exposure to the tolerizing agent has ceased. Operationally, tolerance can be demonstrated by the
lack of a T cell response upon reexposure to specific antigen in the absence of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without limitation
B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high level lymphokine
synthesis by activated T cells, will be useful in situations of tissue, skin and organ transplantation
and in graft-versus-host disease (GVHD). For example, blockage of T cell function should result in
reduced tissue destruction in tissue transplantation. Typically, in tissue transplants, rejection of the
transplant is initiated through its recognition as foreign by T cells, followed by an immune reaction 
that destroys the transplant. The administration of a molecule which inhibits or blocks interaction 
of a B7 lymphocyte antigen with its natural ligand(s) on immune cells (such as a soluble, 
monomeric form of a peptide having B7-2 activity alone or in conjunction with a monomeric form
of a peptide having an activity of another B lymphocyte antigen (e.g., B7-1, B7-3) or blocking
antibody), prior to transplantation can lead to the binding of the molecule to the natural ligand(s) on
the immune cells without transmitting the corresponding costimulatory signal. Blocking B
lymphocyte antigen function in this manner prevents cytokine synthesis by immune cells, such as T
cells, and thus acts as an immunosuppressant. Moreover, the lack of costimulation may also be
sufficient to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term
tolerance by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated 
administration of these blocking reagents. To achieve sufficient immunosuppression or tolerance in 
a subject, it may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular blocking reagents in preventing organ transplant rejection or
GVHD can be assessed using animal models that are predictive of efficacy in humans. Examples of
appropriate systems which can be used include allogeneic cardiac grafts in rats and xenogeneic 
pancreatic islet cell grafts in mice, both of which have been used to examine the 
immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et al.,
Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105 (1992).
In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven Press,
New York, 1989, pp. 846-847) can be used to determine the effect of blocking B lymphocyte 
antigen function in vivo on the development of that disease. 

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block costimulation of T 
cells by disrupting receptor ligand interactions of B lymphocyte antigens can be used to inhibit T 
cell activation and prevent production of autoantibodies or T cell-derived cytokines which may be
involved in the disease process. Additionally, blocking reagents may induce antigen-specific
tolerance of autoreactive T cells which could lead to long-term relief from the disease. The efficacy
of blocking reagents in preventing or alleviating autoimmune disorders can be determined using a 
number of well-characterized animal models of human autoimmune diseases. Examples include
murine experimental autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or
NZB hybrid mice, murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and BB rats,
and murine experimental myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press,
New York, 1989, pp. 840-856).

Upregulation of an antigen function (preferably a B lymphocyte antigen function), as a
means of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response through stimulating B lymphocyte
antigen function may be useful in cases of viral infection. In addition, systemic viral diseases such
as influenza, the common cold, and encephalitis might be alleviated by the systemic administration
of a stimulatory form of B lymphocyte antigens.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed APCs 
either expressing a peptide of the present invention or together with a stimulatory form of a soluble
peptide of the present invention and reintroducing the in vitro activated T cells into the patient. The 
infected cells would now be capable of delivering a costimulatory signal to T cells in vivo, thereby 
activating the T cells. 

In another application, upregulation or enhancement of antigen function (preferably B
lymphocyte antigen function) may be useful in the induction of tumor immunity. Tumor cells (e.g.,
sarcoma, melanoma, lymphoma, leukemia, neuroblastoma, carcinoma) transfected with a nucleic 
acid encoding at least one peptide of the present invention can be administered to a subject to 
overcome tumor-specific tolerance in the subject. If desired, the tumor cell can be transfected to 
express a combination of peptides. For example, tumor cells obtained from a patient can be
transfected ex vivo with an expression vector directing the expression of a peptide having B7-2-like
activity alone, or in conjunction with a peptide having B7-1-like activity and/or B7-3-like activity.
The transfected tumor cells are returned to the patient to result in expression of the peptides on the 
surface of the transfected cell. Alternatively, gene therapy techniques can be used to target a tumor 
cell for transfection in vivo. 

The presence of the peptide of the present invention having the activity of a B lymphocyte
antigen(s) on the surface of the tumor cell provides the necessary costimulation signal to T cells to
induce a T cell mediated immune response against the transfected tumor cells. In addition, tumor 
cells which lack MHC class I or MHC class II molecules, or which fail to reexpress sufficient 
amounts of MHC class I or MHC class II molecules, can be transfected with nucleic acids encoding
all or a fragment (e.g., a cytoplasmic-domain truncated fragment) of an MHC class I α chain
protein and β2 microglobulin protein or an MHC class II α chain protein and an MHC class II β
chain protein to thereby express MHC class I or MHC class II proteins on the cell surface.
Expression of the appropriate class I or class II MHC in conjunction with a peptide having the
activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T cell mediated immune
response against the transfected tumor cell. Optionally, a gene encoding an antisense construct
which blocks expression of an MHC class II associated protein, such as the invariant chain, can also
be cotransfected with a DNA encoding a peptide having the activity of a B lymphocyte antigen to
promote presentation of tumor associated antigens and induce tumor specific immunity. Thus, the 
induction of a T cell mediated immune response in a human subject may be sufficient to overcome 
tumor-specific tolerance in the subject. Alternatively, as described in more detail below, genes 
encoding these proteins or nucleic acids regulating the expression of these proteins may be 
introduced into appropriate host cells to increase or decrease the expression of the proteins as
desired. 

Assaying GENSET proteins or Fragments Thereof for Hematopoiesis Regulating Activity 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for their hematopoiesis regulating activity. For example, the effect of the proteins on
embryonic stem cell differentiation may be evaluated. Numerous assays for such activity are
familiar to those skilled in the art, including the assays disclosed in the following references, which
are incorporated herein by reference: Johansson et al., Cellular Biology 15:141-151, 1995; Keller et
al., Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915,
1993.

The proteins encoded by the cDNAs of the invention or fragments thereof may also be
evaluated for their influence on the lifetime of stem cells and stem cell differentiation. Numerous
assays for such activity are familiar to those skilled in the art, including the assays disclosed in the
following references, which are incorporated herein by reference: Freshney, M.G. Methylcellulose
Colony Forming Assays, in Culture of Hematopoietic Cells. R.I. Freshney, et al. Eds. pp. 265-268,
Wiley-Liss, Inc., New York, NY. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911,
1992; McNiece, I.K. and Briddell, R.A. Primitive Hematopoietic Colony Forming Cells with High
Proliferative Potential, in Culture of Hematopoietic Cells. R.I. Freshney, et al. Eds. Vol pp. 23-39,
Wiley-Liss, Inc., New York, NY. 1994; Neben et al., Experimental Hematology 22:353-359, 1994;
Ploemacher, R.E. Cobblestone Area Forming Cell Assay, in Culture of Hematopoietic Cells. R.I.
Freshney, et al. Eds. pp. 1-21, Wiley-Liss, Inc., New York, NY. 1994; Spooncer, E., Dexter, M. and
Allen, T. Long Term Bone Marrow Cultures in the Presence of Stromal Cells, in Culture of
Hematopoietic Cells. R.I. Freshney, et al. Eds. pp. 163-179, Wiley-Liss, Inc., New York, NY. 1994;
and Sutherland, H.J. Long Term Culture Initiating Cell Assay, in Culture of Hematopoietic Cells.
R.I. Freshney, et al. Eds. pp. 139-162, Wiley-Liss, Inc., New York, NY. 1994.

Those proteins which exhibit hematopoiesis regulatory activity may then be formulated as
pharmaceuticals and used to treat clinical conditions in which regulation of hematopoiesis is
beneficial. For example, a protein of the present invention may be useful in regulation of
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell deficiencies. Even 
marginal biological activity in support of colony forming cells or of factor-dependent cell lines 
indicates involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of 
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy to
stimulate the production of erythroid precursors and/or erythroid cells; in supporting the growth and
proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e., traditional 
CSF activity) useful, for example, in conjunction with chemotherapy to prevent or treat consequent
myelo-suppression; in supporting the growth and proliferation of megakaryocytes and consequently
of platelets thereby allowing prevention or treatment of various platelet disorders such as
thrombocytopenia, and generally for use in place of or complementary to platelet transfusions;
and/or in supporting the growth and proliferation of hematopoietic stem cells which are capable of 
maturing to any and all of the above-mentioned hematopoietic cells and therefore find therapeutic
utility in various stem cell disorders (such as those usually treated with transplantation, including,
without limitation, aplastic anemia and paroxysmal nocturnal hemoglobinuria), as well as in
repopulating the stem cell compartment post irradiation/chemotherapy, either in vivo or ex vivo
(i.e., in conjunction with bone marrow transplantation or with peripheral progenitor cell
transplantation (homologous or heterologous)) as normal cells or genetically manipulated for gene
therapy. Alternatively, as described in more detail below, genes encoding these proteins or nucleic
acids regulating the expression of these proteins may be introduced into appropriate host cells to 
increase or decrease the expression of the proteins as desired. 

Assaying GENSET proteins or Fragments Thereof for Regulation of Tissue Growth 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for their effect on tissue growth. Numerous assays for such activity are familiar to those
skilled in the art, including the assays disclosed in International Patent Publication No. 
WO95/16035, International Patent Publication No. WO95/05846 and International Patent 
Publication No. WO91/07491, which are incorporated herein by reference. 

Assays for wound healing activity include, without limitation, those described in: Winter, 
Epidermal Wound Healing, pp. 71-112 (Maibach, HI and Rovee, DT, eds.), Year Book Medical
Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol. 71:382-84 (1978),
which are incorporated herein by reference. 

Those proteins which are involved in the regulation of tissue growth may then be 
formulated as pharmaceuticals and used to treat clinical conditions in which regulation of tissue
growth is beneficial. For example, a protein of the present invention also may have utility in
compositions used for bone, cartilage, tendon, ligament and/or nerve tissue growth or regeneration,
as well as for wound healing and tissue repair and replacement, and in the treatment of burns,
incisions and ulcers. 

A protein of the present invention, which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone fractures 
and cartilage damage or defects in humans and other animals. Such a preparation employing a
protein of the invention may have prophylactic use in closed as well as open fracture reduction and 
also in the improved fixation of artificial joints. De novo bone formation induced by an osteogenic 
agent contributes to the repair of congenital, trauma induced, or oncologic resection induced 
craniofacial defects, and also is useful in cosmetic plastic surgery. 

A protein of this invention may also be used in the treatment of periodontal disease, and in
other tooth repair processes. Such agents may provide an environment to attract bone-forming
cells, stimulate growth of bone-forming cells or induce differentiation of progenitors of bone- 
forming cells. A protein of the invention may also be useful in the treatment of osteoporosis or 
osteoarthritis, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes. 

Another category of tissue regeneration activity that may be attributable to the protein of the 
present invention is tendon/ligament formation. A protein of the present invention, which induces 
tendon/ligament-like tissue or other tissue formation in circumstances where such tissue is not
normally formed, has application in the healing of tendon or ligament tears, deformities and other
tendon or ligament defects in humans and other animals. Such a preparation employing a 
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing damage to 
tendon or ligament tissue, as well as use in the improved fixation of tendon or ligament to bone or 
other tissues, and in repairing defects to tendon or ligament tissue. De novo tendon/ligament-like
tissue formation induced by a composition of the present invention contributes to the repair of
congenital, trauma induced, or other tendon or ligament defects, and is also useful in
cosmetic plastic surgery for attachment or repair of tendons or ligaments. The compositions of the 
present invention may provide an environment to attract tendon- or ligament-forming cells, 
stimulate growth of tendon- or ligament-forming cells, induce differentiation of progenitors of
tendon- or ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo
for return in vivo to effect tissue repair. The compositions of the invention may also be useful in the
treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The
compositions may also include an appropriate matrix and/or sequestering agent as a carrier as is 
well known in the art. 

The protein of the present invention may also be useful for proliferation of neural cells and
for regeneration of nerve and brain tissue, i.e., for the treatment of central and peripheral nervous
system diseases and neuropathies, as well as mechanical and traumatic disorders, which involve
degeneration, death or trauma to neural cells or nerve tissue. More specifically, a protein may be
used in the treatment of diseases of the peripheral nervous system, such as peripheral nerve injuries, 
peripheral neuropathy and localized neuropathies, and central nervous system diseases, such as 
Alzheimer's disease, Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and
Shy-Drager syndrome. Further conditions which may be treated in accordance with the present
invention include mechanical and traumatic disorders, such as spinal cord disorders, head trauma
and cerebrovascular diseases such as stroke. Peripheral neuropathies resulting from chemotherapy 
or other medical therapies may also be treatable using a protein of the invention. 

Proteins of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like. 

It is expected that a protein of the present invention may also exhibit activity for generation 
or regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, 
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue to 
generate. A protein of the invention may also exhibit angiogenic activity. 

A protein of the present invention may also be useful for gut protection or regeneration and 
treatment of lung or liver fibrosis, reperfusion injury in various tissues, and conditions resulting 
from systemic cytokine damage.

A protein of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above. 

Alternatively, as described in more detail below, genes encoding these proteins or nucleic 
acids regulating the expression of these proteins may be introduced into appropriate host cells to
increase or decrease the expression of the proteins as desired. 

Assaying GENSET proteins or Fragments Thereof for Regulation of Reproductive Hormones or 
Cell Movement 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for their ability to regulate reproductive hormones, such as follicle stimulating hormone.
Numerous assays for such activity are familiar to those skilled in the art, including the assays
disclosed in the following references, which are incorporated herein by reference: Vale et al.,
Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986; Chapter 6.12 (Measurement of Alpha and Beta Chemokines), Current
Protocols in Immunology, J.E. Coligan et al. Eds., Greene Publishing Associates and
Wiley-Interscience; Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146, 1995;
Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867, 1994;
Johnston et al., J. of Immunol. 153:1762-1768, 1994.

Those proteins which exhibit activity as reproductive hormones or regulators of cell 
movement may then be formulated as pharmaceuticals and used to treat clinical conditions in which
regulation of reproductive hormones or cell movement is beneficial. For example, a protein of the
present invention may also exhibit activin- or inhibin-related activities. Inhibins are characterized 
by their ability to inhibit the release of follicle stimulating hormone (FSH), while activins are 
characterized by their ability to stimulate the release of follicle stimulating hormone (FSH). Thus, a
protein of the present invention, alone or in heterodimers with a member of the inhibin α family,
may be useful as a contraceptive based on the ability of inhibins to decrease fertility in female 
mammals and decrease spermatogenesis in male mammals. Administration of sufficient amounts of 
other inhibins can induce infertility in these mammals. Alternatively, the protein of the invention, 
as a homodimer or as a heterodimer with other protein subunits of the inhibin-β group, may be
useful as a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate
FSH release from cells of the anterior pituitary. See, for example, United States Patent 4,798,885, 
the disclosure of which is incorporated herein by reference. A protein of the invention may also be 
useful for advancement of the onset of fertility in sexually immature mammals, so as to increase the 
lifetime reproductive performance of domestic animals such as cows, sheep and pigs. 

Alternatively, as described in more detail below, genes encoding these proteins or nucleic
acids regulating the expression of these proteins may be introduced into appropriate host cells to
increase or decrease the expression of the proteins as desired. 

Assaying GENSET proteins or Fragments Thereof for Chemotactic/Chemokinetic Activity 

The proteins encoded by the cDNAs of the invention or fragments thereof may also be
evaluated for chemotactic/chemokinetic activity. For example, a protein of the present invention
may have chemotactic or chemokinetic activity (e.g., act as a chemokine) for mammalian cells, 
including, for example, monocytes, fibroblasts, neutrophils, T-cells, mast cells, eosinophils, 
epithelial and/or endothelial cells. Chemotactic and chemokinetic proteins can be used to mobilize
or attract a desired cell population to a desired site of action. Chemotactic or chemokinetic proteins
provide particular advantages in treatment of wounds and other trauma to tissues, as well as in
treatment of localized infections. For example, attraction of lymphocytes, monocytes or neutrophils
to tumors or sites of infection may result in improved immune responses against the tumor or 
infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells. 
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis. 

The activity of a protein of the invention may, among other means, be measured by the 
following methods: 

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell population
to another cell population. Suitable assays for movement and adhesion include, without limitation,
those described in: Current Protocols in Immunology, Ed. by J.E. Coligan, A.M. Kruisbeek, D.H.
Margulies, E.M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience
(Chapter 6.12, Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al., J. Clin.
Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146, 1995; Mueller et al., Eur. J. Immunol.
25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867, 1994; Johnston et al., J. of Immunol.
153:1762-1768, 1994.

Assaying GENSET proteins or Fragments Thereof for Regulation of Blood Clotting

The proteins encoded by the cDNAs of the invention or fragments thereof may also be 
evaluated for their effects on blood clotting. Numerous assays for such activity are familiar to those 
skilled in the art, including the assays disclosed in the following references, which are incorporated 
herein by reference: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis
Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

Those proteins which are involved in the regulation of blood clotting may then be 
formulated as pharmaceuticals and used to treat clinical conditions in which regulation of blood 
clotting is beneficial. For example, a protein of the invention may also exhibit hemostatic or
thrombolytic activity. As a result, such a protein is expected to be useful in treatment of various
coagulation disorders (including hereditary disorders, such as hemophilias) or to enhance
coagulation and other hemostatic events in treating wounds resulting from trauma, surgery or other 
causes. A protein of the invention may also be useful for dissolving or inhibiting formation of 
thromboses and for treatment and prevention of conditions resulting therefrom (such as, for
example, infarction of cardiac and central nervous system vessels (e.g., stroke)). Alternatively, as
described in more detail below, genes encoding these proteins or nucleic acids regulating the 
expression of these proteins may be introduced into appropriate host cells to increase or decrease 
the expression of the proteins as desired. 

Assaying GENSET proteins or Fragments Thereof for Involvement in Receptor/Ligand Interactions

The proteins encoded by the cDNAs or a fragment thereof may also be evaluated for their
involvement in receptor/ligand interactions. Numerous assays for such involvement are familiar to
those skilled in the art, including the assays disclosed in the following references, which are
incorporated herein by reference: Chapter 7.28 (Measurement of Cellular Adhesion under Static 
Conditions 7.28.1-7.28.22) in Current Protocols in Immunology , J.E. Coligan et al Eds. Greene 
Publishing Associates and Wiley-Interscience; Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868,
1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med.
169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell
80:661-670, 1995; Gyuris et al., Cell 75:791-803, 1993.

For example, the proteins of the present invention may also demonstrate activity as 
receptors, receptor ligands or inhibitors or agonists of receptor/ligand interactions. Examples of
such receptors and ligands include, without limitation, cytokine receptors and their ligands, receptor
kinases and their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell 
interactions and their ligands (including without limitation, cellular adhesion molecules (such as 
selectins, integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, 
antigen recognition and development of cellular and humoral immune responses). Receptors and
ligands are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions. 

Assaying GENSET proteins or Fragments Thereof for Anti-Inflammatory Activity

The proteins encoded by the cDNAs or a fragment thereof may also be evaluated for
anti-inflammatory activity. The anti-inflammatory activity may be achieved by providing a stimulus to
cells involved in the inflammatory response, by inhibiting or promoting cell-cell interactions (such 
as, for example, cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the 
inflammatory process, inhibiting or promoting cell extravasation, or by stimulating or suppressing 
production of other factors which more directly inhibit or promote an inflammatory response.
Proteins exhibiting such activities can be used to treat inflammatory conditions (including chronic or
acute conditions), including without limitation inflammation associated with infection (such as
septic shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion
injury, endotoxin lethality, arthritis, complement-mediated hyperacute rejection,
nephritis, cytokine or chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease
or inflammation resulting from overproduction of cytokines such as TNF or IL-1. Proteins of the invention may
also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. 

Assaying GENSET proteins or Fragments Thereof for Tumor Inhibition Activity 

The proteins encoded by the cDNAs of the invention or a fragment thereof may also be 
evaluated for tumor inhibition activity. In addition to the activities described above for
immunological treatment or prevention of tumors, a protein of the invention may exhibit other
anti-tumor activities. A protein may inhibit tumor growth directly or indirectly (such as, for example,
via ADCC). A protein may exhibit its tumor inhibitory activity by acting on tumor tissue or tumor 
precursor tissue, by inhibiting formation of tissues necessary to support tumor growth (such as, for 
example, by inhibiting angiogenesis), by causing production of other factors, agents or cell types 
which inhibit tumor growth, or by suppressing, eliminating or inhibiting factors, agents or cell types
which promote tumor growth. 

A protein of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing or
enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape (such
as, for example, breast augmentation or diminution, change in bone form or shape); affecting
biorhythms or circadian cycles or rhythms; affecting the fertility of male or female subjects;
affecting the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of
dietary fat, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional factors or
component(s); affecting behavioral characteristics, including, without limitation, appetite, libido,
stress, cognition (including cognitive disorders), depression (including depressive disorders) and 
violent behaviors; providing analgesic effects or other pain reducing effects; promoting 
differentiation and growth of embryonic stem cells in lineages other than hematopoietic lineages; 

20 hormonal or endocrine activity; in the case of enzymes, correcting deficiencies of the enzyme and 
treating deficiency-related diseases; treatment of hyperproliferative disorders (such as, for example, 
psoriasis); immunoglobulin-like activity (such as, for example, the ability to bind antigens or 
complement); and the ability to act as an antigen in a vaccine composition to raise an immune 
response against such protein or another material or entity which is cross-reactive with such protein.

Table I

Seq Id No  Internal designation     Type  Vector
1          119-003-4-0-C2-CS        DNA   pBluescriptII SK-
2          105-016-1-0-D3-CS        DNA   pBluescriptII SK-
3          105-016-3-0-G10-CS       DNA   pBluescriptII SK-
4          105-026-1-0-A5-CS        DNA   pBluescriptII SK-
5          105-031-1-0-A11-CS       DNA   pBluescriptII SK-
6          105-031-2-0-D3-CS        DNA   pBluescriptII SK-
7          105-035-2-0-C6-CS        DNA   pBluescriptII SK-
8          105-037-2-0-H11-CS       DNA   pBluescriptII SK-
9          105-053-4-0-E8-CS        DNA   pBluescriptII SK-
10         105-074-3-0-H10-CS       DNA   pBluescriptII SK-
11         105-089-3-0-G10-CS       DNA   pBluescriptII SK-
12         105-095-2-0-G11-CS       DNA   pBluescriptII SK-
13         106-006-1-0-E3-CS        DNA   pBluescriptII SK-
14         106-037-1-0-E9-CS.cor    DNA   pBluescriptII SK-
15         106-037-1-0-E9-CS.fr     DNA   pBluescriptII SK-
16         106-043-4-0-H3-CS        DNA   pBluescriptII SK-
17         110-007-1-0-C7-CS        DNA   pBluescriptII SK-
18         114-016-1-0-H8-CS        DNA   pBluescriptII SK-
19         116-004-3-0-A6-CS        DNA   pBluescriptII SK-
20         116-054-3-0-E6-CS        DNA   pBluescriptII SK-
21         116-055-1-0-A3-CS        DNA   pBluescriptII SK-
22         116-055-2-0-F7-CS        DNA   pBluescriptII SK-
23         116-088-4-0-A9-CS        DNA   pBluescriptII SK-
24         116-091-1-0-D9-CS        DNA   pBluescriptII SK-
25         116-110-2-0-F4-CS        DNA   pBluescriptII SK-
26         116-111-1-0-H9-CS        DNA   pBluescriptII SK-
27         116-111-4-0-B3-CS        DNA   pBluescriptII SK-
28         116-115-2-0-F8-CS        DNA   pBluescriptII SK-
29         116-119-3-0-H5-CS        DNA   pBluescriptII SK-
30         117-001-5-0-G3-CS        DNA   pBluescriptII SK-
31         145-25-3-0-B4-CS.cor     DNA   pBluescriptII SK-
32         145-25-3-0-B4-CS.fr      DNA   pBluescriptII SK-
33         145-56-3-0-D5-CS         DNA   pBluescriptII SK-
34         145-59-2-0-A7-CS         DNA   pBluescriptII SK-
35         157-15-4-0-B11-CS        DNA   pBluescriptII SK-
36         160-103-1-0-F11-CS       DNA   pBluescriptII SK-
37         160-37-2-0-H7-CS         DNA   pBluescriptII SK-
38         160-58-3-0-H3-CS         DNA   pBluescriptII SK-
39         160-75-4-0-A9-CS         DNA   pBluescriptII SK-
40         174-10-2-0-F8-CS         DNA   pPT
41         174-33-3-0-F6-CS         DNA   pPT
42         174-38-1-0-B6-CS         DNA   pPT
43         174-38-3-0-C9-CS         DNA   pPT
44         174-39-2-0-A3-CS         DNA   pPT
45         174-41-1-0-A6-CS         DNA   pPT
46         174-5-3-0-H7-CS          DNA   pPT
47         174-7-4-0-H1-CS          DNA   pPT
48         175-1-3-0-E5-CS.cor      DNA   pPT






49         175-1-3-0-E5-CS.fr       DNA   pPT
50         180-19-4-0-F4-CS         DNA   pBluescriptII SK-
51         181-10-1-0-D10-CS        DNA   pBluescriptII SK-
52         181-16-1-0-G7-CS         DNA   pBluescriptII SK-
53         181-16-2-0-A7-CS         DNA   pBluescriptII SK-
54         181-20-3-0-B5-CS         DNA   pBluescriptII SK-
55         181-3-3-0-B8-CS          DNA   pBluescriptII SK-
56         181-3-3-0-C9-CS          DNA   pBluescriptII SK-
57         182-1-2-0-D12-CS         DNA   pBluescriptII SK-
58         184-1-4-0-C11-CS         DNA   pBluescriptII SK-
59         184-4-1-0-A11-CS         DNA   pBluescriptII SK-
60         187-12-4-0-A8-CS         DNA   pBluescriptII SK-
61         187-2-2-0-A3-CS          DNA   pBluescriptII SK-
62         187-31-0-0-f12-CS        DNA   pBluescriptII SK-
63         187-34-0-0-l12-CS        DNA   pBluescriptII SK-
64         187-37-0-0-c10-CS        DNA   pBluescriptII SK-
65         187-38-0-0-l10-CS        DNA   pBluescriptII SK-
66         187-39-0-0-k12-CS        DNA   pBluescriptII SK-
67         187-41-0-0-i21-CS        DNA   pBluescriptII SK-
68         188-11-1-0-B3-CS         DNA   pBluescriptII SK-
69         188-18-4-0-A9-CS         DNA   pBluescriptII SK-
70         188-28-4-0-B12-CS.cor    DNA   pBluescriptII SK-
71         188-28-4-0-B12-CS.fr     DNA   pBluescriptII SK-
72         188-28-4-0-D4-CS         DNA   pBluescriptII SK-
73         188-41-1-0-B8-CS.cor     DNA   pBluescriptII SK-
74         188-41-1-0-B8-CS.fr      DNA   pBluescriptII SK-
75         188-45-1-0-D9-CS         DNA   pBluescriptII SK-
76         188-9-2-0-E1-CS          DNA   pBluescriptII SK-
77         105-079-3-0-A11-CS       DNA   pBluescriptII SK-
78         105-092-1-0-H7-CS        DNA   pBluescriptII SK-
79         105-141-4-0-H9-CS        DNA   pBluescriptII SK-
80         109-013-1-0-B9-CS        DNA   pBluescriptII SK-
81         110-008-4-0-D9-CS        DNA   pBluescriptII SK-
82         114-001-3-0-A2-CS        DNA   pBluescriptII SK-
83         114-028-2-0-C1-CS        DNA   pBluescriptII SK-
84         114-032-1-0-H10-CS       DNA   pBluescriptII SK-
85         114-043-2-0-A10-CS       DNA   pBluescriptII SK-
86         114-044-1-0-C5-CS        DNA   pBluescriptII SK-
87         116-003-3-0-D10-CS       DNA   pBluescriptII SK-
88         116-003-3-0-G12-CS       DNA   pBluescriptII SK-
89         116-011-2-0-F11-CS       DNA   pBluescriptII SK-
90         116-033-3-0-E4-CS        DNA   pBluescriptII SK-
91         116-041-4-0-B6-CS        DNA   pBluescriptII SK-
92         116-044-2-0-C4-CS        DNA   pBluescriptII SK-
93         116-075-1-0-E6-CS        DNA   pBluescriptII SK-
94         116-094-4-0-G5-CS        DNA   pBluescriptII SK-
95         117-005-3-0-F2-CS        DNA   pBluescriptII SK-
96         121-007-3-0-D9-CS        DNA   pBluescriptII SK-
97         145-91-3-0-D10-CS        DNA   pBluescriptII SK-
98         157-17-1-0-F4-CS         DNA   pBluescriptII SK-
99         160-11-3-0-G8-CS         DNA   pBluescriptII SK-
100        160-24-1-0-F12-CS        DNA   pBluescriptII SK-
101        160-24-2-0-E9-CS         DNA   pBluescriptII SK-
102        160-25-4-0-D2-CS         DNA   pBluescriptII SK-
103        160-31-3-0-A11-CS        DNA   pBluescriptII SK-
104        160-32-1-0-F6-CS         DNA   pBluescriptII SK-
105        160-37-1-0-A3-CS         DNA   pBluescriptII SK-
106        160-40-3-0-E9-CS         DNA   pBluescriptII SK-
107        160-58-3-0-E4-CS         DNA   pBluescriptII SK-
108        160-85-3-0-D4-CS         DNA   pBluescriptII SK-
109        160-95-3-0-A11-CS        DNA   pBluescriptII SK-
110        162-10-4-0-F9-CS.cor     DNA   pBluescriptII SK-
111        162-10-4-0-F9-CS.fr      DNA   pBluescriptII SK-
112        174-13-2-0-E4-CS         DNA   pPT
113        174-46-2-0-B11-CS        DNA   pPT
114        179-8-2-0-A6-CS          DNA   pBluescriptII SK-
115        180-22-3-0-B6-CS         DNA   pBluescriptII SK-
116        181-13-1-0-F7-CS         DNA   pBluescriptII SK-
117        181-15-4-0-F7-CS         DNA   pBluescriptII SK-
118        181-20-1-0-G7-CS         DNA   pBluescriptII SK-
119        184-15-3-0-D1-CS         DNA   pBluescriptII SK-
120        187-12-2-0-G11-CS        DNA   pBluescriptII SK-
121        187-2-2-0-A12-CS         DNA   pBluescriptII SK-
122        187-30-0-0-k23-CS        DNA   pBluescriptII SK-
123        187-36-0-0-e19-CS        DNA   pBluescriptII SK-
124        187-38-0-0-d22-CS        DNA   pBluescriptII SK-
125        187-39-0-0-b9-CS         DNA   pBluescriptII SK-
126        187-39-0-0-e6-CS         DNA   pBluescriptII SK-
127        187-45-0-0-l18-CS        DNA   pBluescriptII SK-
128        187-45-0-0-m21-CS        DNA   pBluescriptII SK-
129        187-45-0-0-n8-CS         DNA   pBluescriptII SK-
130        187-46-0-0-f23-CS        DNA   pBluescriptII SK-
131        187-5-1-0-A12-CS         DNA   pBluescriptII SK-
132        187-5-1-0-F6-CS          DNA   pBluescriptII SK-
133        187-5-2-0-B2-CS          DNA   pBluescriptII SK-
134        187-5-3-0-D5-CS          DNA   pBluescriptII SK-
135        187-51-0-0-f9-CS         DNA   pBluescriptII SK-
136        187-6-1-0-B9-CS          DNA   pBluescriptII SK-
137        187-6-4-0-C10-CS         DNA   pBluescriptII SK-
138        188-19-2-0-C8-CS         DNA   pBluescriptII SK-
139        188-22-4-0-G6-CS         DNA   pBluescriptII SK-
140        188-28-4-0-D11-CS        DNA   pBluescriptII SK-
141        188-29-1-0-E10-CS        DNA   pBluescriptII SK-
142        188-34-4-0-E5-CS         DNA   pBluescriptII SK-
143        188-9-3-0-A5-CS          DNA   pBluescriptII SK-
144        105-021-3-0-C3-CS        DNA   pBluescriptII SK-
145        105-037-4-0-H12-CS       DNA   pBluescriptII SK-
146        105-073-2-0-A7-CS        DNA   pBluescriptII SK-
147        109-002-4-0-C6-CS        DNA   pBluescriptII SK-
148        109-003-1-0-G4-CS        DNA   pBluescriptII SK-
149        116-118-4-0-A8-CS        DNA   pBluescriptII SK-
150        145-52-2-0-D12-CS        DNA   pBluescriptII SK-
151        145-7-2-0-G5-CS          DNA   pBluescriptII SK-
152        145-7-3-0-D3-CS          DNA   pBluescriptII SK-
153        157-17-2-0-C1-CS         DNA   pBluescriptII SK-
154        160-101-3-0-H2-CS        DNA   pBluescriptII SK-
155        160-12-1-0-D10-CS        DNA   pBluescriptII SK-
156        160-28-4-0-C4-CS         DNA   pBluescriptII SK-
157        160-31-3-0-E4-CS         DNA   pBluescriptII SK-
158        160-40-1-0-H4-CS         DNA   pBluescriptII SK-
159        160-54-1-0-F7-CS         DNA   pBluescriptII SK-
160        160-88-3-0-A8-CS.cor     DNA   pBluescriptII SK-
161        160-88-3-0-A8-CS.fr      DNA   pBluescriptII SK-
162        160-99-4-0-E4-CS         DNA   pBluescriptII SK-
163        161-5-4-0-B6-CS          DNA   pBluescriptII SK-
164        174-17-1-0-D6-CS         DNA   pPT
165        174-32-4-0-F8-CS         DNA   pPT
166        174-38-4-0-D11-CS        DNA   pPT
167        174-8-2-0-C10-CS         DNA   pPT
168        179-14-2-0-F11-CS        DNA   pBluescriptII SK-
169        179-9-4-0-B8-CS          DNA   pBluescriptII SK-
170        181-10-1-0-C9-CS         DNA   pBluescriptII SK-
171        187-5-3-0-C7-CS          DNA   pBluescriptII SK-
172        188-26-4-0-F5-CS         DNA   pBluescriptII SK-
173        188-27-3-0-G1-CS         DNA   pBluescriptII SK-
174        188-29-2-0-H1-CS         DNA   pBluescriptII SK-
175        188-31-1-0-E6-CS         DNA   pBluescriptII SK-
176        188-45-1-0-D3-CS         DNA   pBluescriptII SK-
177        188-5-1-0-H6-CS          DNA   pBluescriptII SK-
178        188-9-1-0-C10-CS         DNA   pBluescriptII SK-
179        105-016-3-0-C5-CS        DNA   pBluescriptII SK-
180        105-026-4-0-D9-CS        DNA   pBluescriptII SK-
181        105-053-2-0-D9-CS        DNA   pBluescriptII SK-
182        105-069-3-0-A11-CS       DNA   pBluescriptII SK-
183        105-076-4-0-F6-CS        DNA   pBluescriptII SK-
184        105-135-2-0-F9-CS        DNA   pBluescriptII SK-
185        106-023-4-0-F6-CS        DNA   pBluescriptII SK-
186        110-001-3-0-C11-CS       DNA   pBluescriptII SK-
187        110-002-3-0-F9-CS        DNA   pBluescriptII SK-
188        114-019-3-0-D9-CS        DNA   pBluescriptII SK-
189        114-029-1-0-C6-CS        DNA   pBluescriptII SK-
190        114-032-4-0-B1-CS        DNA   pBluescriptII SK-
191        114-070-2-0-H4-CS        DNA   pBluescriptII SK-
192        116-016-3-0-F11-CS       DNA   pBluescriptII SK-
193        116-022-4-0-G2-CS        DNA   pBluescriptII SK-
194        116-052-2-0-H8-CS        DNA   pBluescriptII SK-
195        116-053-4-0-B4-CS        DNA   pBluescriptII SK-
196        116-094-3-0-H2-CS        DNA   pBluescriptII SK-
197        116-112-4-0-C7-CS        DNA   pBluescriptII SK-
198        116-123-3-0-F12-CS       DNA   pBluescriptII SK-
199        123-008-1-0-C5-CS        DNA   pBluescriptII SK-
200        145-53-2-0-H8-CS         DNA   pBluescriptII SK-
201        145-57-2-0-C9-CS.cor     DNA   pBluescriptII SK-
202        145-57-2-0-C9-CS.fr      DNA   pBluescriptII SK-
203        145-7-3-0-B12-CS         DNA   pBluescriptII SK-
204        157-12-2-0-D1-CS         DNA   pBluescriptII SK-






205        157-16-2-0-D5-CS         DNA   pBluescriptII SK-
206        157-18-2-0-A7-CS         DNA   pBluescriptII SK-
207        160-103-1-0-B10-CS       DNA   pBluescriptII SK-
208        160-104-4-0-F3-CS        DNA   pBluescriptII SK-
209        160-22-2-0-D10-CS        DNA   pBluescriptII SK-
210        160-24-3-0-F12-CS        DNA   pBluescriptII SK-
211        160-3-2-0-H3-CS          DNA   pBluescriptII SK-
212        160-58-2-0-A2-CS         DNA   pBluescriptII SK-
213        160-73-1-0-B4-CS         DNA   pBluescriptII SK-
214        160-75-4-0-E6-CS         DNA   pBluescriptII SK-
215        160-97-3-0-E9-CS         DNA   pBluescriptII SK-
216        174-1-4-0-E9-CS          DNA   pPT
217        174-12-4-0-C2-CS         DNA   pPT
218        180-19-4-0-H2-CS         DNA   pBluescriptII SK-
219        181-10-4-0-G12-CS        DNA   pBluescriptII SK-
220        181-3-2-0-F6-CS          DNA   pBluescriptII SK-
221        181-4-4-0-A12-CS         DNA   pBluescriptII SK-
222        181-9-2-0-F12-CS.cor     DNA   pBluescriptII SK-
223        181-9-2-0-F12-CS.fr      DNA   pBluescriptII SK-
224        184-13-3-0-E11-CS        DNA   pBluescriptII SK-
225        184-4-2-0-D3-CS          DNA   pBluescriptII SK-
226        184-7-1-0-E7-CS          DNA   pBluescriptII SK-
227        184-8-4-0-G9-CS          DNA   pBluescriptII SK-
228        187-10-3-0-G9-CS         DNA   pBluescriptII SK-
229        187-32-0-0-m20-CS        DNA   pBluescriptII SK-
230        187-32-0-0-n21-CS.cor    DNA   pBluescriptII SK-
231        187-32-0-0-n21-CS.fr     DNA   pBluescriptII SK-
232        187-4-2-0-E6-CS          DNA   pBluescriptII SK-
233        187-40-0-0-i15-CS        DNA   pBluescriptII SK-
234        187-47-0-0-e24-CS        DNA   pBluescriptII SK-
235        187-9-3-0-A2-CS          DNA   pBluescriptII SK-
236        188-26-4-0-H1-CS         DNA   pBluescriptII SK-
237        188-35-3-0-G9-CS         DNA   pBluescriptII SK-
238        188-38-4-0-D8-CS         DNA   pBluescriptII SK-
239        188-41-1-0-E6-CS         DNA   pBluescriptII SK-
240        188-42-2-0-F3-CS.cor     DNA   pBluescriptII SK-
241        188-42-2-0-F3-CS.fr      DNA   pBluescriptII SK-
242        119-003-4-0-C2-CS        PRT   pBluescriptII SK-
243        105-016-1-0-D3-CS        PRT   pBluescriptII SK-
244        105-016-3-0-G10-CS       PRT   pBluescriptII SK-
245        105-026-1-0-A5-CS        PRT   pBluescriptII SK-
246        105-031-1-0-A11-CS       PRT   pBluescriptII SK-
247        105-031-2-0-D3-CS        PRT   pBluescriptII SK-
248        105-035-2-0-C6-CS        PRT   pBluescriptII SK-
249        105-037-2-0-H11-CS       PRT   pBluescriptII SK-
250        105-053-4-0-E8-CS        PRT   pBluescriptII SK-
251        105-074-3-0-H10-CS       PRT   pBluescriptII SK-
252        105-089-3-0-G10-CS       PRT   pBluescriptII SK-
253        105-095-2-0-G11-CS       PRT   pBluescriptII SK-
254        106-006-1-0-E3-CS        PRT   pBluescriptII SK-
255        106-037-1-0-E9-CS.cor    PRT   pBluescriptII SK-
256        106-037-1-0-E9-CS.fr     PRT   pBluescriptII SK-






257        106-043-4-0-H3-CS        PRT   pBluescriptII SK-
258        110-007-1-0-C7-CS        PRT   pBluescriptII SK-
259        114-016-1-0-H8-CS        PRT   pBluescriptII SK-
260        116-004-3-0-A6-CS        PRT   pBluescriptII SK-
261        116-054-3-0-E6-CS        PRT   pBluescriptII SK-
262        116-055-1-0-A3-CS        PRT   pBluescriptII SK-
263        116-055-2-0-F7-CS        PRT   pBluescriptII SK-
264        116-088-4-0-A9-CS        PRT   pBluescriptII SK-
265        116-091-1-0-D9-CS        PRT   pBluescriptII SK-
266        116-110-2-0-F4-CS        PRT   pBluescriptII SK-
267        116-111-1-0-H9-CS        PRT   pBluescriptII SK-
268        116-111-4-0-B3-CS        PRT   pBluescriptII SK-
269        116-115-2-0-F8-CS        PRT   pBluescriptII SK-
270        116-119-3-0-H5-CS        PRT   pBluescriptII SK-
271        117-001-5-0-G3-CS        PRT   pBluescriptII SK-
272        145-25-3-0-B4-CS.cor     PRT   pBluescriptII SK-
273        145-25-3-0-B4-CS.fr      PRT   pBluescriptII SK-
274        145-56-3-0-D5-CS         PRT   pBluescriptII SK-
275        145-59-2-0-A7-CS         PRT   pBluescriptII SK-
276        157-15-4-0-B11-CS        PRT   pBluescriptII SK-
277        160-103-1-0-F11-CS       PRT   pBluescriptII SK-
278        160-37-2-0-H7-CS         PRT   pBluescriptII SK-
279        160-58-3-0-H3-CS         PRT   pBluescriptII SK-
280        160-75-4-0-A9-CS         PRT   pBluescriptII SK-
281        174-10-2-0-F8-CS         PRT   pPT
282        174-33-3-0-F6-CS         PRT   pPT
283        174-38-1-0-B6-CS         PRT   pPT
284        174-38-3-0-C9-CS         PRT   pPT
285        174-39-2-0-A3-CS         PRT   pPT
286        174-41-1-0-A6-CS         PRT   pPT
287        174-5-3-0-H7-CS          PRT   pPT
288        174-7-4-0-H1-CS          PRT   pPT
289        175-1-3-0-E5-CS.cor      PRT   pPT
290        175-1-3-0-E5-CS.fr       PRT   pPT
291        180-19-4-0-F4-CS         PRT   pBluescriptII SK-
292        181-10-1-0-D10-CS        PRT   pBluescriptII SK-
293        181-16-1-0-G7-CS         PRT   pBluescriptII SK-
294        181-16-2-0-A7-CS         PRT   pBluescriptII SK-
295        181-20-3-0-B5-CS         PRT   pBluescriptII SK-
296        181-3-3-0-B8-CS          PRT   pBluescriptII SK-
297        181-3-3-0-C9-CS          PRT   pBluescriptII SK-
298        182-1-2-0-D12-CS         PRT   pBluescriptII SK-
299        184-1-4-0-C11-CS         PRT   pBluescriptII SK-
300        184-4-1-0-A11-CS         PRT   pBluescriptII SK-
301        187-12-4-0-A8-CS         PRT   pBluescriptII SK-
302        187-2-2-0-A3-CS          PRT   pBluescriptII SK-
303        187-31-0-0-f12-CS        PRT   pBluescriptII SK-
304        187-34-0-0-l12-CS        PRT   pBluescriptII SK-
305        187-37-0-0-c10-CS        PRT   pBluescriptII SK-
306        187-38-0-0-l10-CS        PRT   pBluescriptII SK-
307        187-39-0-0-k12-CS        PRT   pBluescriptII SK-
308        187-41-0-0-i21-CS        PRT   pBluescriptII SK-






309        188-11-1-0-B3-CS         PRT   pBluescriptII SK-
310        188-18-4-0-A9-CS         PRT   pBluescriptII SK-
311        188-28-4-0-B12-CS.cor    PRT   pBluescriptII SK-
312        188-28-4-0-B12-CS.fr     PRT   pBluescriptII SK-
313        188-28-4-0-D4-CS         PRT   pBluescriptII SK-
314        188-41-1-0-B8-CS.cor     PRT   pBluescriptII SK-
315        188-41-1-0-B8-CS.fr      PRT   pBluescriptII SK-
316        188-45-1-0-D9-CS         PRT   pBluescriptII SK-
317        188-9-2-0-E1-CS          PRT   pBluescriptII SK-
318        105-079-3-0-A11-CS       PRT   pBluescriptII SK-
319        105-092-1-0-H7-CS        PRT   pBluescriptII SK-
320        105-141-4-0-H9-CS        PRT   pBluescriptII SK-
321        109-013-1-0-B9-CS        PRT   pBluescriptII SK-
322        110-008-4-0-D9-CS        PRT   pBluescriptII SK-
323        114-001-3-0-A2-CS        PRT   pBluescriptII SK-
324        114-028-2-0-C1-CS        PRT   pBluescriptII SK-
325        114-032-1-0-H10-CS       PRT   pBluescriptII SK-
326        114-043-2-0-A10-CS       PRT   pBluescriptII SK-
327        114-044-1-0-C5-CS        PRT   pBluescriptII SK-
328        116-003-3-0-D10-CS       PRT   pBluescriptII SK-
329        116-003-3-0-G12-CS       PRT   pBluescriptII SK-
330        116-011-2-0-F11-CS       PRT   pBluescriptII SK-
331        116-033-3-0-E4-CS        PRT   pBluescriptII SK-
332        116-041-4-0-B6-CS        PRT   pBluescriptII SK-
333        116-044-2-0-C4-CS        PRT   pBluescriptII SK-
334        116-075-1-0-E6-CS        PRT   pBluescriptII SK-
335        116-094-4-0-G5-CS        PRT   pBluescriptII SK-
336        117-005-3-0-F2-CS        PRT   pBluescriptII SK-
337        121-007-3-0-D9-CS        PRT   pBluescriptII SK-
338        145-91-3-0-D10-CS        PRT   pBluescriptII SK-
339        157-17-1-0-F4-CS         PRT   pBluescriptII SK-
340        160-11-3-0-G8-CS         PRT   pBluescriptII SK-
341        160-24-1-0-F12-CS        PRT   pBluescriptII SK-
342        160-24-2-0-E9-CS         PRT   pBluescriptII SK-
343        160-25-4-0-D2-CS         PRT   pBluescriptII SK-
344        160-31-3-0-A11-CS        PRT   pBluescriptII SK-
345        160-32-1-0-F6-CS         PRT   pBluescriptII SK-
346        160-37-1-0-A3-CS         PRT   pBluescriptII SK-
347        160-40-3-0-E9-CS         PRT   pBluescriptII SK-
348        160-58-3-0-E4-CS         PRT   pBluescriptII SK-
349        160-85-3-0-D4-CS         PRT   pBluescriptII SK-
350        160-95-3-0-A11-CS        PRT   pBluescriptII SK-
351        162-10-4-0-F9-CS.cor     PRT   pBluescriptII SK-
352        162-10-4-0-F9-CS.fr      PRT   pBluescriptII SK-
353        174-13-2-0-E4-CS         PRT   pPT
354        174-46-2-0-B11-CS        PRT   pPT
355        179-8-2-0-A6-CS          PRT   pBluescriptII SK-
356        180-22-3-0-B6-CS         PRT   pBluescriptII SK-
357        181-13-1-0-F7-CS         PRT   pBluescriptII SK-
358        181-15-4-0-F7-CS         PRT   pBluescriptII SK-
359        181-20-1-0-G7-CS         PRT   pBluescriptII SK-
360        184-15-3-0-D1-CS         PRT   pBluescriptII SK-






361        187-12-2-0-G11-CS        PRT   pBluescriptII SK-
362        187-2-2-0-A12-CS         PRT   pBluescriptII SK-
363        187-30-0-0-k23-CS        PRT   pBluescriptII SK-
364        187-36-0-0-e19-CS        PRT   pBluescriptII SK-
365        187-38-0-0-d22-CS        PRT   pBluescriptII SK-
366        187-39-0-0-b9-CS         PRT   pBluescriptII SK-
367        187-39-0-0-e6-CS         PRT   pBluescriptII SK-
368        187-45-0-0-l18-CS        PRT   pBluescriptII SK-
369        187-45-0-0-m21-CS        PRT   pBluescriptII SK-
370        187-45-0-0-n8-CS         PRT   pBluescriptII SK-
371        187-46-0-0-f23-CS        PRT   pBluescriptII SK-
372        187-5-1-0-A12-CS         PRT   pBluescriptII SK-
373        187-5-1-0-F6-CS          PRT   pBluescriptII SK-
374        187-5-2-0-B2-CS          PRT   pBluescriptII SK-
375        187-5-3-0-D5-CS          PRT   pBluescriptII SK-
376        187-51-0-0-f9-CS         PRT   pBluescriptII SK-
377        187-6-1-0-B9-CS          PRT   pBluescriptII SK-
378        187-6-4-0-C10-CS         PRT   pBluescriptII SK-
379        188-19-2-0-C8-CS         PRT   pBluescriptII SK-
380        188-22-4-0-G6-CS         PRT   pBluescriptII SK-
381        188-28-4-0-D11-CS        PRT   pBluescriptII SK-
382        188-29-1-0-E10-CS        PRT   pBluescriptII SK-
383        188-34-4-0-E5-CS         PRT   pBluescriptII SK-
384        188-9-3-0-A5-CS          PRT   pBluescriptII SK-
385        105-021-3-0-C3-CS        PRT   pBluescriptII SK-
386        105-037-4-0-H12-CS       PRT   pBluescriptII SK-
387        105-073-2-0-A7-CS        PRT   pBluescriptII SK-
388        109-002-4-0-C6-CS        PRT   pBluescriptII SK-
389        109-003-1-0-G4-CS        PRT   pBluescriptII SK-
390        116-118-4-0-A8-CS        PRT   pBluescriptII SK-
391        145-52-2-0-D12-CS        PRT   pBluescriptII SK-
392        145-7-2-0-G5-CS          PRT   pBluescriptII SK-
393        145-7-3-0-D3-CS          PRT   pBluescriptII SK-
394        157-17-2-0-C1-CS         PRT   pBluescriptII SK-
395        160-101-3-0-H2-CS        PRT   pBluescriptII SK-
396        160-12-1-0-D10-CS        PRT   pBluescriptII SK-
397        160-28-4-0-C4-CS         PRT   pBluescriptII SK-
398        160-31-3-0-E4-CS         PRT   pBluescriptII SK-
399        160-40-1-0-H4-CS         PRT   pBluescriptII SK-
400        160-54-1-0-F7-CS         PRT   pBluescriptII SK-
401        160-88-3-0-A8-CS.cor     PRT   pBluescriptII SK-
402        160-88-3-0-A8-CS.fr      PRT   pBluescriptII SK-
403        160-99-4-0-E4-CS         PRT   pBluescriptII SK-
404        161-5-4-0-B6-CS          PRT   pBluescriptII SK-
405        174-17-1-0-D6-CS         PRT   pPT
406        174-32-4-0-F8-CS         PRT   pPT
407        174-38-4-0-D11-CS        PRT   pPT
408        174-8-2-0-C10-CS         PRT   pPT
409        179-14-2-0-F11-CS        PRT   pBluescriptII SK-
410        179-9-4-0-B8-CS          PRT   pBluescriptII SK-
411        181-10-1-0-C9-CS         PRT   pBluescriptII SK-
412        187-5-3-0-C7-CS          PRT   pBluescriptII SK-
413        188-26-4-0-F5-CS         PRT   pBluescriptII SK-
414        188-27-3-0-G1-CS         PRT   pBluescriptII SK-
415        188-29-2-0-H1-CS         PRT   pBluescriptII SK-
416        188-31-1-0-E6-CS         PRT   pBluescriptII SK-
417        188-45-1-0-D3-CS         PRT   pBluescriptII SK-
418        188-5-1-0-H6-CS          PRT   pBluescriptII SK-
419        188-9-1-0-C10-CS         PRT   pBluescriptII SK-
420        105-016-3-0-C5-CS        PRT   pBluescriptII SK-
421        105-026-4-0-D9-CS        PRT   pBluescriptII SK-
422        105-053-2-0-D9-CS        PRT   pBluescriptII SK-
423        105-069-3-0-A11-CS       PRT   pBluescriptII SK-
424        105-076-4-0-F6-CS        PRT   pBluescriptII SK-
425        105-135-2-0-F9-CS        PRT   pBluescriptII SK-
426        106-023-4-0-F6-CS        PRT   pBluescriptII SK-
427        110-001-3-0-C11-CS       PRT   pBluescriptII SK-
428        110-002-3-0-F9-CS        PRT   pBluescriptII SK-
429        114-019-3-0-D9-CS        PRT   pBluescriptII SK-
430        114-029-1-0-C6-CS        PRT   pBluescriptII SK-
431        114-032-4-0-B1-CS        PRT   pBluescriptII SK-
432        114-070-2-0-H4-CS        PRT   pBluescriptII SK-
433        116-016-3-0-F11-CS       PRT   pBluescriptII SK-
434        116-022-4-0-G2-CS        PRT   pBluescriptII SK-
435        116-052-2-0-H8-CS        PRT   pBluescriptII SK-
436        116-053-4-0-B4-CS        PRT   pBluescriptII SK-
437        116-094-3-0-H2-CS        PRT   pBluescriptII SK-
438        116-112-4-0-C7-CS        PRT   pBluescriptII SK-
439        116-123-3-0-F12-CS       PRT   pBluescriptII SK-
440        123-008-1-0-C5-CS        PRT   pBluescriptII SK-
441        145-53-2-0-H8-CS         PRT   pBluescriptII SK-
442        145-57-2-0-C9-CS.cor     PRT   pBluescriptII SK-
443        145-57-2-0-C9-CS.fr      PRT   pBluescriptII SK-
444        145-7-3-0-B12-CS         PRT   pBluescriptII SK-
445        157-12-2-0-D1-CS         PRT   pBluescriptII SK-
446        157-16-2-0-D5-CS         PRT   pBluescriptII SK-
447        157-18-2-0-A7-CS         PRT   pBluescriptII SK-
448        160-103-1-0-B10-CS       PRT   pBluescriptII SK-
449        160-104-4-0-F3-CS        PRT   pBluescriptII SK-
450        160-22-2-0-D10-CS        PRT   pBluescriptII SK-
451        160-24-3-0-F12-CS        PRT   pBluescriptII SK-
452        160-3-2-0-H3-CS          PRT   pBluescriptII SK-
453        160-58-2-0-A2-CS         PRT   pBluescriptII SK-
454        160-73-1-0-B4-CS         PRT   pBluescriptII SK-
455        160-75-4-0-E6-CS         PRT   pBluescriptII SK-
456        160-97-3-0-E9-CS         PRT   pBluescriptII SK-
457        174-1-4-0-E9-CS          PRT   pPT
458        174-12-4-0-C2-CS         PRT   pPT
459        180-19-4-0-H2-CS         PRT   pBluescriptII SK-
460        181-10-4-0-G12-CS        PRT   pBluescriptII SK-
461        181-3-2-0-F6-CS          PRT   pBluescriptII SK-
462        181-4-4-0-A12-CS         PRT   pBluescriptII SK-
463        181-9-2-0-F12-CS.cor     PRT   pBluescriptII SK-
464        181-9-2-0-F12-CS.fr      PRT   pBluescriptII SK-
465        184-13-3-0-E11-CS        PRT   pBluescriptII SK-
466        184-4-2-0-D3-CS          PRT   pBluescriptII SK-
467        184-7-1-0-E7-CS          PRT   pBluescriptII SK-
468        184-8-4-0-G9-CS          PRT   pBluescriptII SK-
469        187-10-3-0-G9-CS         PRT   pBluescriptII SK-
470        187-32-0-0-m20-CS        PRT   pBluescriptII SK-
471        187-32-0-0-n21-CS.cor    PRT   pBluescriptII SK-
472        187-32-0-0-n21-CS.fr     PRT   pBluescriptII SK-
473        187-4-2-0-E6-CS          PRT   pBluescriptII SK-
474        187-40-0-0-i15-CS        PRT   pBluescriptII SK-
475        187-47-0-0-e24-CS        PRT   pBluescriptII SK-
476        187-9-3-0-A2-CS          PRT   pBluescriptII SK-
477        188-26-4-0-H1-CS         PRT   pBluescriptII SK-
478        188-35-3-0-G9-CS         PRT   pBluescriptII SK-
479        188-38-4-0-D8-CS         PRT   pBluescriptII SK-
480        188-41-1-0-E6-CS         PRT   pBluescriptII SK-
481        188-42-2-0-F3-CS.cor     PRT   pBluescriptII SK-
482        188-42-2-0-F3-CS.fr      PRT   pBluescriptII SK-

Table II

Seq Id No  Full coding   Signal peptide    Coding sequence     Polyadenylation  Polyadenylation
           sequence      coding sequence   for mature protein  signal           site
1          [169-1692]    [169-249]         [250-1692]          [2126-2131]      [2152-2201]
2          [148-1140]    [148-240]         [241-1140]          [1592-1597]      [1615-1631]
3          [85-906]      [85-135]          [136-906]           [1159-1164]      [1184-1245]
4          [31-1248]     [31-135]          [136-1248]          None detected    [1607-1623]
5          [72-143]      [72-119]          [120-143]           [1416-1421]      [1438-1454]
6          [111-1154]    [111-197]         [198-1154]          [1602-1607]      [1623-1639]
7          [66-1256]     [66-173]          [174-1256]          None detected    [1752-1768]
8          [190-1398]    [190-252]         [253-1398]          [1470-1475]      [1494-1510]
9          [78-410]      [78-155]          [156-410]           None detected    [866-882]
10         [84-299]      [84-134]          [135-299]           [1814-1819]      [1833-1849]
11         [55-468]      [55-99]           [100-468]           [531-536]        [549-565]
12         [152-475]     [152-244]         [245-475]           [1623-1628]      [1647-1663]
13         [112-552]     [112-183]         [184-552]           [706-711]        [729-744]
14         [101-1243]    [101-199]         [200-1243]          [1720-1725]      [1745-1759]
15         [101-517]     [101-199]         [200-517]           [1716-1721]      [1741-1755]
16         [59-853]      [59-100]          [101-853]           [894-899]        [922-936]
17         [73-672]      [73-132]          [133-672]           [689-694]        [711-747]
18         [94-1275]     [94-210]          [211-1275]          [1849-1854]      [1870-1884]
19         [42-515]      [42-92]           [93-515]            [649-654]        [677-691]
20         [271-969]     [271-366]         [367-969]           [1093-1098]      [1124-1138]
21         [76-276]      [76-135]          [136-276]           [436-441]        [455-468]
22         [6-287]       [6-80]            [81-287]            [684-689]        [706-720]
23         [171-692]     [171-227]         [228-692]           [691-696]        [713-727]
24         [137-454]     [137-187]         [188-454]           [440-445]        [456-470]
25         [238-609]     [238-291]         [292-609]           [948-953]        [973-987]
26         [80-862]      [80-127]          [128-862]           [875-880]        [894-908]
27         [83-310]      [83-157]          [158-310]           [725-730]        [748-762]
28         [310-906]     [310-357]         [358-906]           [1071-1076]      [1088-1102]
29         [24-287]      [24-131]          [132-287]           [405-410]        [422-436]
30         [132-1574]    [132-206]         [207-1574]          [1899-1904]      [1923-1938]
31         [117-545]     [117-245]         [246-545]           None detected    [1100-1116]
32         [117-362]     None detected     [117-362]           None detected    [1098-1114]
33         [144-1262]    [144-224]         [225-1262]          [2035-2040]      [2056-2072]
34         [35-316]      [35-109]          [110-316]           None detected    [393-409]


35 


[177-767] 


[177-236] 


[237-767] 


None detected 


[822-836] j 


| 36 


[208-1239] 


[208-294] 


[295-1239] 


None detected 


[1307-1323] 


37 


[60-1682] 


[60-143] 


[144-1682] 


None detected 


[1929-1945] 


38 


[198-998] 


[198-269] 


[270-998] 


[1292-1297] 


[1315-1330] 


39 


[505-1590] 


[505-624] 


[625-1590] 


[2089-2094] 


[2108-2124] 


40 


[84-326] 


[84-146] 


[147-326] 


[1122-1127] 


[1142-1159] | 


41 | [56-1678] | [56-139] | [140-1678] | None detected | [1936-1953]
42 | [119-1522] | [119-181] | [182-1522] | None detected | [1671-1688]
43 | [334-1551] | [334-426] | [427-1551] | None detected | [1925-1942]
44 | [72-986] | [72-149] | [150-986] | [1608-1613] | [1640-1657]
45 | [157-1482] | [157-219] | [220-1482] | None detected | [1716-1733]
46 | [195-1052] | [195-338] | [339-1052] | None detected | [1854-1871]
47 | [217-1410] | [217-279] | [280-1410] | [1482-1487] | [1507-1536]
48 | [103-492] | [103-162] | [163-492] | [794-799] | [815-832]
49 | [234-491] | [234-293] | [294-491] | [793-798] | [814-831]
50 | [180-800] | [180-248] | [249-800] | [880-885] | [901-917]
51 | [140-472] | [140-211] | [212-472] | None detected | [605-621]
52 | [68-484] | [68-112] | [113-484] | None detected | [657-673]
53 | [38-517] | [38-118] | [119-517] | [861-866] | [885-897]
54 | [92-634] | [92-139] | [140-634] | None detected | None detected
55 | [27-767] | [27-80] | [81-767] | None detected | [1031-1047]
56 | [4-399] | [4-126] | [127-399] | [891-896] | [909-923]
57 | [127-879] | [127-198] | [199-879] | None detected | [1224-1240]
58 | [156-566] | [156-221] | [222-566] | [870-875] | [888-902]
59 | [35-1657] | [35-118] | [119-1657] | None detected | [1955-1969]
60 | [77-937] | [77-127] | [128-937] | [1098-1103] | [1116-1132]
61 | [9-503] | [9-113] | [114-503] | [594-599] | [615-631]
62 | [21-464] | [21-95] | [96-464] | [650-655] | [692-722]
63 | [178-1050] | [178-279] | [280-1050] | [1400-1405] | [1426-1442]
64 | [32-274] | [32-178] | [179-274] | [756-761] | [779-795]
65 | [222-920] | [222-311] | [312-920] | [1191-1196] | [1220-1236]
66 | [101-355] | [101-160] | [161-355] | [772-777] | [788-881]
67 | [173-487] | [173-301] | [302-487] | [486-491] | [508-524]
68 | [210-1082] | [210-311] | [312-1082] | [1432-1437] | [1456-1472]
69 | [172-1449] | [172-255] | [256-1449] | None detected | [1721-1737]
70 | [30-1427] | [30-77] | [78-1427] | [1594-1599] | [1621-1637]
71 | [30-1175] | [30-77] | [78-1175] | [1593-1598] | [1620-1636]
72 | [66-839] | [66-173] | [174-839] | None detected | [1742-1758]
73 | [64-903] | [64-162] | [163-903] | [1612-1617] | [1631-1647]
74 | [64-585] | [64-162] | [163-585] | [1611-1616] | [1630-1646]
75 | [274-753] | [274-324] | [325-753] | [1931-1936] | [1947-1963]
76 | [191-1468] | [191-274] | [275-1468] | None detected | [1741-1757]
77 | [48-950] | [48-107] | [108-950] | [1983-1988] | [2011-2027]
78 | [156-512] | [156-206] | [207-512] | [1831-1836] | [1864-1880]
79 | [67-351] | [67-183] | [184-351] | None detected | [568-584]
80 | [259-831] | [259-375] | [376-831] | None detected | [1337-1351]
81 | [111-377] | [111-233] | [234-377] | [689-694] | [706-720]
82 | [223-432] | [223-336] | [337-432] | [986-991] | [1015-1029]
83 | [769-1272] | [769-843] | [844-1272] | None detected | [1774-1788]
84 | [30-527] | [30-74] | [75-527] | [738-743] | [756-805]
85 | [39-506] | [39-83] | [84-506] | None detected | [800-814]
86 | [115-429] | [115-210] | [211-429] | [565-570] | [584-598]


87 | [332-574] | [332-412] | [413-574] | None detected | [630-699]
88 | [133-417] | [133-213] | [214-417] | [876-881] | [891-905]
89 | [113-364] | [113-172] | [173-364] | None detected | [500-514]
90 | [9-380] | [9-104] | [105-380] | [483-488] | [504-518]
91 | [155-340] | [155-292] | [293-340] | [728-733] | [754-808]
92 | [185-634] | [185-253] | [254-634] | [704-709] | [723-737]
93 | [53-646] | [53-91] | [92-646] | [694-699] | [714-728]
94 | [247-510] | [247-318] | [319-510] | [544-549] | [568-582]
95 | [143-592] | [143-277] | [278-592] | [1877-1882] | [1898-1913]
96 | [33-458] | [33-89] | [90-458] | [637-642] | [654-670]
97 | [1-336] | [1-81] | [82-336] | [900-905] | [923-939]
98 | [174-443] | [174-269] | [270-443] | [629-634] | [647-661]
99 | [282-521] | [282-386] | [387-521] | [600-605] | [631-647]
100 | [251-643] | [251-295] | [296-643] | None detected | [990-1006]
101 | [179-475] | [179-295] | [296-475] | [995-1000] | [1015-1059]
102 | [34-327] | [34-162] | [163-327] | [466-471] | [498-514]
103 | [303-953] | [303-359] | [360-953] | [1124-1129] | [1142-1158]
104 | [97-645] | [97-156] | [157-645] | [1524-1529] | [1547-1563]
105 | [80-820] | [80-118] | [119-820] | [1587-1592] | [1606-1621]
106 | [77-388] | [77-217] | [218-388] | [524-529] | [541-557]
107 | [139-513] | [139-201] | [202-513] | [566-571] | [584-600]
108 | [81-986] | [81-134] | [135-986] | [1092-1097] | [1113-1129]
109 | [266-586] | [266-307] | [308-586] | [745-750] | [762-778]
110 | [59-745] | [59-160] | [161-745] | None detected | [1285-1301]
111 | [59-676] | [59-160] | [161-676] | None detected | [1284-1300]
112 | [15-278] | [15-146] | [147-278] | [1580-1585] | [1600-1617]
113 | [167-619] | [167-262] | [263-619] | [1598-1603] | [1617-1634]
114 | [223-417] | [223-270] | [271-417] | [655-660] | [677-693]
115 | [166-732] | [166-237] | [238-732] | [753-758] | [768-784]
116 | [75-623] | [75-215] | [216-623] | [767-772] | [788-804]
117 | [30-335] | [30-71] | [72-335] | [450-455] | [468-484]
118 | [21-752] | [21-107] | [108-752] | None detected | [970-985]
119 | [185-715] | [185-253] | [254-715] | [785-790] | [814-839]
120 | [54-527] | [54-116] | [117-527] | [545-550] | [567-583]
121 | [129-686] | [129-185] | [186-686] | [989-994] | [1008-1024]
122 | [165-614] | [165-305] | [306-614] | [719-724] | [744-760]
123 | [192-476] | [192-326] | [327-476] | [555-560] | [578-594]
124 | [16-297] | [16-93] | [94-297] | None detected | [543-559]
125 | [216-635] | [216-335] | [336-635] | [717-722] | [728-744]
126 | [164-280] | [164-268] | [269-280] | [789-794] | [809-824]
127 | [68-301] | [68-190] | [191-301] | [485-490] | [510-526]
128 | [179-427] | [179-298] | [299-427] | [579-584] | [602-618]
129 | [22-297] | [22-66] | [67-297] | [742-747] | [760-776]
130 | [9-845] | [9-134] | [135-845] | [964-969] | [983-998]
131 | [27-578] | [27-119] | [120-578] | [742-747] | [763-779]
132 | [408-710] | [408-533] | [534-710] | [985-990] | [1009-1025]


133 | [247-501] | [247-306] | [307-501] | None detected | [592-607]
134 | [333-602] | [333-416] | [417-602] | None detected | [761-774]
135 | [110-376] | [110-208] | [209-376] | [582-587] | [601-611]
136 | [22-417] | [22-66] | [67-417] | [888-893] | [909-925]
137 | [62-367] | [62-103] | [104-367] | [638-643] | [658-674]
138 | [107-1618] | [107-178] | [179-1618] | [1688-1693] | [1709-1725]
139 | [16-471] | [16-93] | [94-471] | None detected | [1458-1474]
140 | [222-374] | [222-299] | [300-374] | None detected | [637-653]
141 | [59-274] | [59-127] | [128-274] | [1452-1457] | [1474-1490]
142 | [158-442] | [158-301] | [302-442] | [621-626] | [645-661]
143 | [5-454] | [5-64] | [65-454] | [1745-1750] | [1773-1789]
144 | [241-1302] | None detected | [241-1302] | [1968-1973] | [1990-2006]
145 | [15-635] | None detected | [15-635] | [1057-1062] | [1080-1096]
146 | [109-738] | None detected | [109-738] | [1633-1638] | [1650-1666]
147 | [21-1145] | None detected | [21-1145] | [1648-1653] | [1666-1687]
148 | [70-1596] | None detected | [70-1596] | [1712-1717] | [1733-1747]
149 | [129-362] | None detected | [129-362] | [597-602] | [626-658]
150 | [109-594] | None detected | [109-594] | [1999-2004] | [2029-2045]
151 | [150-587] | None detected | [150-587] | None detected | [772-788]
152 | [173-847] | None detected | [173-847] | [1894-1899] | [1915-1931]
153 | [100-441] | None detected | [100-441] | [479-484] | [500-514]
154 | [32-1132] | None detected | [32-1132] | None detected | [1167-1183]
155 | [160-996] | None detected | [160-996] | [1504-1509] | [1529-1545]
156 | [11-529] | None detected | [11-529] | [1042-1047] | [1053-1068]
157 | [135-749] | None detected | [135-749] | [1055-1060] | [1081-1097]
158 | [98-637] | None detected | [98-637] | [862-867] | [878-894]
159 | [221-670] | None detected | [221-670] | [669-674] | [688-703]
160 | [165-674] | None detected | [165-674] | [808-813] | [833-849]
161 | [165-671] | None detected | [165-671] | [805-810] | [830-846]
162 | [28-1128] | None detected | [28-1128] | [1121-1126] | [1159-1176]
163 | [135-194] | None detected | [135-194] | [1050-1055] | [1068-1084]
164 | [173-847] | None detected | [173-847] | [1757-1762] | [1776-1793]
165 | [8-1141] | None detected | [8-1141] | None detected | [1832-1849]
166 | [136-264] | None detected | [136-264] | [1720-1725] | [1731-1748]
167 | [14-1048] | None detected | [14-1048] | [1234-1239] | [1258-1275]
168 | [70-777] | None detected | [70-777] | [987-992] | [1007-1023]
169 | [38-400] | None detected | [38-400] | [1043-1048] | [1069-1085]
170 | [63-572] | None detected | [63-572] | [750-755] | [767-776]
171 | [160-867] | None detected | [160-867] | [1178-1183] | [1203-1219]
172 | [68-640] | None detected | [68-640] | None detected | [1471-1487]
173 | [132-1298] | None detected | [132-1298] | [1873-1878] | [1899-1915]
174 | [259-1701] | None detected | [259-1701] | None detected | [1974-1990]
175 | [213-1274] | None detected | [213-1274] | [1940-1945] | [1955-1971]
176 | [68-127] | None detected | [68-127] | None detected | [1597-1613]
177 | [65-1024] | None detected | [65-1024] | [1291-1296] | [1315-1361]
178 | [109-585] | None detected | [109-585] | [1059-1064] | [1082-1113]


179 | [29-577] | None detected | [29-577] | [1917-1922] | [1944-1960]
180 | [23-451] | None detected | [23-451] | [1405-1410] | [1427-1443]
181 | [232-450] | None detected | [232-450] | None detected | [589-605]
182 | [758-1183] | None detected | [758-1183] | None detected | [1708-1724]
183 | [486-932] | None detected | [486-932] | None detected | [1670-1686]
184 | [80-304] | None detected | [80-304] | None detected | [447-463]
185 | [188-691] | None detected | [188-691] | [707-712] | [727-773]
186 | [94-573] | None detected | [94-573] | None detected | [739-753]
187 | [181-462] | None detected | [181-462] | None detected | [740-754]
188 | [6-290] | None detected | [6-290] | None detected | [971-998]
189 | [115-411] | None detected | [115-411] | [573-578] | [591-605]
190 | [3-368] | None detected | [3-368] | [481-486] | [511-526]
191 | [174-527] | None detected | [174-527] | [878-883] | [896-910]
192 | [57-203] | None detected | [57-203] | [579-584] | [599-668]
193 | [68-334] | None detected | [68-334] | [562-567] | [583-637]
194 | [183-443] | None detected | [183-443] | [670-675] | [692-706]
195 | [94-228] | None detected | [94-228] | None detected | [656-670]
196 | [133-327] | None detected | [133-327] | [465-470] | [496-510]
197 | [22-357] | None detected | [22-357] | None detected | [486-500]
198 | [4-333] | None detected | [4-333] | [633-638] | [653-667]
199 | [1-363] | None detected | [1-363] | [474-479] | [498-514]
200 | [41-337] | None detected | [41-337] | None detected | [401-462]
201 | [1-551] | None detected | [1-551] | None detected | [535-551]
202 | [34-315] | None detected | [34-315] | None detected | [534-550]
203 | [1-315] | None detected | [1-315] | [371-376] | [392-408]
204 | [94-582] | None detected | [94-582] | None detected | [651-665]
205 | [540-923] | None detected | [540-923] | None detected | [994-1008]
206 | [77-364] | None detected | [77-364] | [367-372] | [391-455]
207 | [65-544] | None detected | [65-544] | [710-715] | [733-749]
208 | [117-467] | None detected | [117-467] | [557-562] | [578-594]
209 | [893-1897] | None detected | [893-1897] | [2066-2071] | [2082-2098]
210 | [85-342] | None detected | [85-342] | None detected | [412-428]
211 | [155-433] | None detected | [155-433] | [713-718] | [735-769]
212 | [63-386] | None detected | [63-386] | [878-883] | [898-914]
213 | [460-1290] | None detected | [460-1290] | [1449-1454] | [1473-1489]
214 | [21-539] | None detected | [21-539] | [741-746] | [760-776]
215 | [34-1143] | None detected | [34-1143] | [1375-1380] | [1397-1412]
216 | [6-1184] | None detected | [6-1184] | [1735-1740] | [1744-1773]
217 | [29-376] | None detected | [29-376] | None detected | [1 lo4-125 1 J
218 | [78-566] | None detected | [78-566] | [858-863] | [878-894]
219 | [16-705] | None detected | [16-705] | [868-873] | [894-910]
220 | [103-405] | None detected | [103-405] | [482-487] | [503-519]
221 | [72-350] | None detected | [72-350] | [593-598] | [616-632]
222 | [38-436] | None detected | [38-436] | None detected | [636-652]
223 | [38-322] | None detected | [38-322] | None detected | [634-650]
224 | [202-480] | None detected | [202-480] | [472-477] | [488-502]


225 | [171-1670] | None detected | [171-1670] | [1706-1711] | [1725-1739]
226 | [199-618] | None detected | [199-618] | [626-631] | [643-657]
227 | [182-481] | None detected | [182-481] | None detected | [874-888]
228 | [161-517] | None detected | [161-517] | None detected | [701-716]
229 | [86-505] | None detected | [86-505] | [618-623] | [638-654]
230 | [56-382] | None detected | [56-382] | [598-603] | [619-635]
231 | [56-355] | None detected | [56-355] | [597-602] | [618-634]
232 | [76-498] | None detected | [76-498] | [546-551] | [567-583]
233 | [199-600] | None detected | [199-600] | [705-710] | [737-753]
234 | [211-612] | None detected | [211-612] | [717-722] | [746-762]
235 | [5-259] | None detected | [5-259] | [502-507] | [521-537]
236 | [23-370] | None detected | [23-370] | [956-961] | [978-994]
237 | [41-352] | None detected | [41-352] | None detected | [646-662]
238 | [3-1319] | None detected | [3-1319] | [1791-1796] | [io 1 j- 1 oZ!/J
239 | [421-768] | None detected | [421-768] | [1045-1050] | [1067-1083]
240 | [78-590] | None detected | [78-590] | None detected | [1815-1831]
241 | [78-608] | None detected | [78-608] | None detected | [1814-1830]






Table III 



List of variants 

92;119
14;15
110;111
69;174;76
2;12
172;176;177
150;152;164;166
154;162
77;143
34;62
230;231
63;68
8;47
48;49;66
7;72
160;161
144;175
17;21
31;32
5;6
3;10
96;121
37;41;59
70;71
19;24
186;195;204
73;74
240;241
221;235
222;223
42;45
157;163
190;229
117;137
122;233;234
201;202
80;139

Table IV



Seq Id No | Preferentially excluded fragments
1 | 192..235;2099..2201
2 | 174..225;1605..1631
3 | 1111..1245
4 | 1590..1598;1607..1623
5 | 1385..1453
6 | 1571..1639
7 | 1732..1768
8 | 1494..1510
9 | 570..882
10 | 1176..1218;1710..1742;1833..1849
11 | 219..253;455..565
12 | 178..229;1636..1663
13 | 729..744
14 | 790..827;1735..1759
15 | 788..825;1731..1755
16 | 922..936
17 | 668..747
18 | 1870..1884
19 | 677..691
20 | 1124..1138
21 | 450..468
22 | 393..411;706..720
23 | 713..727
24 | 456..470
25 | 876..928;973..987
26 | 894..908
27 | 748..762
28 | 1088..1102
29 | 422..436
30 | 1879..1918;1923..1938
31 | 774..1116
32 | 772..1114
33 | 2056..2072
34 | 393..409
35 | 784..836
36 | 544..551;1307..1323
37 | 1867..1874;1929..1945
38 | 1315..1330
39 | 2108..2124
40 | 413..421;1116..1159
41 | 1863..1870;1936..1953


42 | 1623..1688
43 | 1895..1942
44 | 1640..1657
45 | 1661..1733
46 | 1555..1871
47 | 1507..1523
48 | 541..832
49 | 540..831
50 | 901..917
51 | 2..10;605..621
52 | 585..673
53 | 885..897
54 | 4..13;761..1101
55 | 1031..1047
56 | 873..905;907..923
57 | 1224..1240
58 | 861..902
59 | 1842..1849;1955..1969
60 | 1116..1132
61 | 15..46;615..631
62 | 651..722
63 | 1426..1442
64 | 739..795
65 | 1220..1236
66 | 520..881
67 | 413..524
68 | 1444..1472
69 | 1721..1737
70 | 1621..1637
71 | 1620..1636
72 | 777..784;1742..1758
73 | 1631..1647
74 | 1630..1646
75 | 1947..1963
76 | 1741..1757
77 | 1561..1913;2011..2027
78 | 727..819;880..894;901..1280;1841..1880
79 | 418..584
80 | 331..353;844..1214;1337..1351
81 | 706..720
82 | 639..713;1008..1029
83 | 1454..1788
84 | 712..805
85 | 800..814
86 | 584..598
87 | 122..308;593..699


88 | 855..905
89 | 500..514
90 | 81..101;198..205;504..518
91 | 650..808
92 | 128..201;723..737
93 | 714..728
94 | 568..582
95 | 1761..1773;1898..1913
96 | 654..670
97 | 883..938
98 | 616..661
99 | 631..647
100 | 853..1006
101 | 537..544;949..1059
102 | 498..514
103 | 1142..1158
104 | 1524..1563
105 | 1230..1259;1606..1621
106 | 505..557
107 | 584..600
108 | 378..385;1113..1129
109 | 729..778
110 | 992..1301
111 | 991..1300
112 | 1131..1139;1569..1617
113 | 1526..1634
114 | 457..509;677..693
115 | 768..784
116 | 360..670;788..804
117 | 435..484
118 | 433..452;764..985
119 | 128..201;801..839
120 | 554..564;567..583
121 | 872..908;1008..1024
122 | 744..760
123 | 578..594
124 | 94..102;248..559
125 | 728..744
126 | 809..824
127 | 510..526
128 | 602..618
129 | 472..553;569..776
130 | 983..998
131 | 396..468;763..779
132 | 478..532;1009..1025
133 | 592..607


134 | 761..774
135 | 556..563;601..611
136 | 887..919
137 | 658..674
138 | 1651..1725
139 | 49..71;988..1358;1458..1474
140 | 324..653
141 | 720..730;1449..1490
142 | 44..119;498..505;578..585;645..661
143 | 1322..1666;1773..1789
144 | 1828..1897;1919..1968;1990..2006
145 | 936..955;1060..1096
146 | 778..827;1650..1666
147 | 1170..1207;1647..1687
148 | 1733..1747
149 | 579..658
150 | 1432..1440;1728..1778;2004..2045
151 | 772..788
152 | 1496..1504;1792..1842;1915..1931
153 | 500..514
154 | 1167..1183
155 | 1529..1545
156 | 703..1068
157 | 873..881;1081..1097
158 | 878..894
159 | 688..703
160 | 833..849
161 | 830..846
162 | 1159..1176
163 | 869..876;1068..1084
164 | 1444..1463;1496..1504;1743..1793
165 | 1233..1319;1697..1849
166 | 1407..1426;1459..1467;1694..1748
167 | 1258..1275
168 | 84..129;1002..1023
169 | 436..472;596..604;673..689;732..954;995..1085
170 | 767..776
171 | 1203..1219
172 | 1411..1487
173 | 1861..1915
174 | 1974..1990
175 | 1800..1869;1891..1940;1955..1971
176 | 1597..1613
177 | 1 OO..Z 1 /, 1 Z / / ..lJOl
178 | 930..978;1002..1113
179 | 951..1000;1364..1533;1944..1960


180 | 1427..1443
181 | 107..181;276..311;449..605
182 | 1143..1450;1677..1724
183 | 1..251;648..655;1347..1686
184 | 447..463
185 | 150..159;623..773
186 | 340..476;739..753
187 | 740..754
188 | 307..315;668..998
189 | 118..125;529..536;591..605
190 | 492..526
191 | 872..910
192 | 525..668
193 | 91..135;461..637
194 | 392..458;551..671;692..706
195 | 656..670
196 | 283..379;458..466;496..510
197 | 1..96;483..500
198 | 625..667
199 | 474..513
200 | 370..462
201 | 535..551
202 | 534..550
203 | 374..408
204 | 651..665
205 | 994..1008
206 | 348..455
207 | 733..749
208 | 1..49;578..594
209 | 2082..2098
210 | 412..428
211 | 689..769
212 | 898..914
213 | 1266..1489
214 | 760..776
215 | 1304..1311;1383..1412
216 | 648..691;1711..1773
217 | 644..856;910..1251
218 | 878..894
219 | 894..910
220 | 503..519
221 | 616..632
222 | 636..652
223 | 634..650
224 | 50..57;488..502
225 | 534..577;1725..1739


226 | 643..657
227 | 1..84;874..888
228 | 701..716
229 | 638..654
230 | 263..573;619..635
231 | 263..573;619..635
232 | 567..583
233 | 737..753
234 | 746..762
235 | 499..537
236 | 905..912;944..994
237 | 348..662
239 | 829..1083
240 | 1508..1831
241 | 1507..1830

Table Va



Seq Id No | Preferentially excluded fragments | Preferentially included fragments
1 | [1-540];[556-615];[2061-2096];[2098-2201] | [541-555];[616-2060];[2097-2097]
2 | [1-511];[533-619];[621-690];[730-1132] | [512-532];[620-620];[691-729];[1133-1631]
3 | [2-539];[1178-1245] | [1-1];[540-1177]
4 | [1-250];[297-383];[386-514];[1025-1064] | [251-296];[384-385];[515-1024];[1065-1623]
5 | [27-116];[118-391] | [1-26];[117-117];[392-1454]
6 | [1-93];[96-168];[170-262];[264-461] | [94-95];[169-169];[263-263];[462-1639]
7 | [1-95];[97-451] | [96-96];[452-1768]
8 | [1-502];[1314-1491] | [503-1313];[1492-1510]
9 | [1-864] | [865-882]
10 | [1-428] | [429-1849]
11 | [1-454];[482-514] | [455-481];[515-565]
12 | [1-375];[379-511];[533-690];[730-783];[814-1164] | [376-378];[512-532];[691-729];[784-813];[1165-1663]
13 | [2-337];[339-556] | [1-1];[338-338];[557-744]
14 | [29-366];[368-507] | [1-28];[367-367];[508-1759]
15 | [29-366];[368-524] | [1-28];[367-367];[525-1755]
16 | [1-641] | [642-936]
17 | [1-708];[711-747] | [709-710]
18 | [1-639] | [640-1884]
19 | [1-631] | [632-691]
20 | [3-416];[418-490] | [1-2];[417-417];[491-1138]
21 | [1-468] | None
22 | [1-720] | None
23 | [1-711] | [712-727]
24 | [1-469] | [470-470]
25 | [1-231];[234-488] | [232-233];[489-987]
26 | [1-296];[300-642];[644-737] | [297-299];[643-643];[738-908]
27 | [1-306];[308-762] | [307-307]
28 | [1-446];[448-1102] | [447-447]
29 | [1-436] | None
30 | [7-334];[1420-1468];[1474-1614];[1616-1804];[1845-1919] | [1-6];[335-1419];[1469-1473];[1615-1615];[1805-1844];[1920-1938]
31 | [1-342];[345-519];[823-893];[977-1016] | [343-344];[520-822];[894-976];[1017-1116]
32 | [1-517];[821-891];[975-1014] | [518-820];[892-974];[1015-1114]
33 | [36-352];[354-457];[728-832];[834-1096];[1253-1289];[1291-1350];[1352-1412];[1726-1873] | [1-35];[353-353];[458-727];[833-833];[1097-1252];[1290-1290];[1351-1351];[1413-1725];[1874-2072]
34 | [1-409] | None
35 | [14-105] | [1-13];[106-836]
36 | [1-572];[1120-1271] | [573-1119];[1272-1323]
37 | [20-98];[100-510];[1591-1681];[1683-1870] | [1-19];[99-99];[511-1590];[1682-1682];[1871-1945]






38 | [1-547] | [548-1330]
39 | [1-445] | [446-2124]
40 | [1-473];[475-528] | [474-474];[529-1159]
41 | [16-506];[1587-1866] | [1-15];[507-1586];[1867-1953]


42


r7-7141 • f 2zl4-45 1 1 • TQ74- 1 2261 


ri-lir235-243Vf452-971iri 227-16881 


43 | [1-455];[1670-1925] | [456-1669];[1926-1942]
44 | [1-579];[815-1031] | [580-814];[1032-1657]
45 | [1-489];[1012-1264] | [490-1011];[1265-1733]
46 | [1-400];[1184-1223];[1225-1705];[1740-1818] | [401-1183];[1224-1224];[1706-1739];[1819-1871]


47 


n -S99i-n ^26-1 sosi 

[ I -.J2--7J 5 [ 1 JtO" 1 -/V/JJ 


TS10-1 12S1-T1 S06-1 S211 


48 
*+o 


ri - i 1 1 i-n ii-s i ows60-sR9i 


n 12-1 121T5 1 l-559ir590-8121 


4Q 


ri-iiovrii7 soowsso-srri 


riii-iin-rsi o.ssri - rs80-8i 1 1 


SO 


f 1 -6S0ir657-R6RirR71-91 11 
[ i -ojvj ,[uj^-ovjoj , [o / j-y i j j 


T6S1 -65 11T869-872W9 14-91 71 


51 | [1-504];[515-605] | [505-514];[606-621]


!>Z 


[ 1 -j33J 


rc'i/: A711 




ri ^aii 
[Z003J 


r 1 1 l»rSA4 R071 


j4 


n KiTi-fsmo QTm-ffio Q7/ii-rQAA 

[lOZ /J,[oUZ-o /UJ,[ooZ-7D'tJj[yOO- 

10181N037-10801 


r^7S R011-TR71 RRIVTOIS OAS1*ri01Q 

1036];[1081-1101] 


SS 


n 17A1-ri7R SOSI 


ri77 177irS0A-1 0471 


SA 


ri 1401 


T141 07S1 


S 7 


[lOZoJ 


TS7Q 1 7401 1 


S8 


n insn-nis isii-riS4 i40i«ri47 S7Q1 

[1-1UoJ,[1 Jj-131J,[l [ j-+ZOZ:/J 


nOQ 1141-T1S7 1 S11-T141 -14 11* rSlO-0071 
[i uy-i 1-+J ,i 1 jZ- 1 j J J, [D*+ 1 OH I J,[jDU-yUZJ 


SQ 


P4-dRSl-r 1 SAA 1ASA1-MASR 1 R4S1 
[*t--*O.Jj,[ 1 jDO-IODOJ, [ 1 OjO- 1 0-+3J 


ri 1VF4RA 1 SASl-n AS7-1 AS71-n 84A-1 9691 


ao 


ri 7Rii 


r?R4-1 1 171 
[ZO*r- 1 1 z>zj 


A1 

D 1 


TQ-4A81 
[I7-*+OOJ 


ri -81*r469-Al 11 


oz 


f1 S7SVTAR0 7771 
[10Z.Jj,[Oo;7- /zzj 


rS7A-ARRl 


61 


ri SRi rOO 1071M04 76SW7Q6-4001 
[ i -ooj,[yu- 1 yzj,[ i j j[z^o-^fuyj 


rR9-89i ri 91-1 911-r76A-79Sl-r41 0-1zl471 


A4 


n si 71 

[I-DI /J 


TS 1 R-79S1 

[J 1 O- / 7J J 


AS 


ri-40A1-r40S 7101 


T407-4071 • r740- 1 7 161 


66 | [1-489];[849-881] | [490-848]


A7 


ri ^n^i 


rcnz -;74l 


AC 
Do 


1 1 OZDJ,[JZo--+--* 1 J, [-4-4-4 OU-+J 


r^7A i77i«r42i7-ii/in*rsns 14771 


AO 


ri S741«rAlA 71 Sl»r71 7 ROQl-TRl 1 RRSW1SA7 
[ 1 OZ-4j,[0,50- / 1 jj,[ / 1 /-0Wyj,[o 1 1-oo.Jj^l.JO/- 

1715] 


rS7S A1S1-T71A 71A1TR10 R101-rRR6 

1566];[1716-1737] 


70 | [12-487] | [1-11];[488-1637]
71 | [12-487] | [1-11];[488-1636]
72 | [1-451] | [452-1758]
73 | [1-167];[242-464] | [168-241];[465-1647]
74 | [1-167];[242-464] | [168-241];[465-1646]
75 | [1-471] | [472-1963]


/O 


n ^d-r^An ^/m«rA^^ 7i/i"i'r7iA Roci-rc^n 

[1 -Jj8J,[3oU-j4JJ,[OD j-/34J,[/jO-ozoJ,[oOU- 

904];[1586-1734] 


r^^o i-;qi- t^AA A-;zn*r7is 7isi-rc7o cooi-ron-x 

Ijjy-oDy \,yj £ fH-ODH],[ / jj-/ j j J,[oZV-oZ^J,[yU3- 

1585];[1 735-1 757] " | 


77 


ri 141-riA_474l-rSS7 7701-M700 174A1-H74R 
[J-J^J^jO-^ /-4j,[_)OZ- / /UJ,[i /U7-1 '^OJ,[l /HO* 

1785];[ 1825-1 899] 


n 71-riS lSl-r47S-SRll-r771 -1 70R1T1 747- 
[ 1 -ZJ,[ j J- J J J, [H / j-jO *J>[' ' * - / v/oj ,[ 1 /H /- 

1 747];[1 786-1 824];[1 900-2027] 


78 


[l-75];[77-319];[914-1052];[1063- 
1126];[1 168-1203] 


[76-76];[320-913];[1053-1062];[1127- 

1167];[1 204-1 880] 1 
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79 


[1-425] 


[426-584] 


80 


[l-752];[947-1017];[1084-1170] 


[753-946];[1018-1083];[l 171-1351] 


81 


[1-496]; [498-720] 


[497-497] 


82 


[1-324] 


[325-1029] 


83 


[1-477];[1474-1529];[1537-1566];[1577- 
1616];[1622-1662];[1717-1753] 


[478-1473];[1530-1536];[1567-1576];[1617- 
1621];[1663-1716];[1754-1788] 


84 


[1-496];[499-568];[752-805]


[497-498];[569-751]


85 


[1-527] 


[528-814] 


86 


[1-360] 


[361-598] 


87 


[1-78];[80-583];[625-699]


[79-79];[584-624] 


88 


[1-889] 


[890-905] 


89 


[1-513] 


[514-514] 


90 


[1-122];[124-155];[157-435];[437-517]


[123-123];[156-156];[436-436];[518-518] 


91 


[1-133];[165-808]


[134-164] 


92 


[1-725] 


[726-737] 


93 


[1-409] 


[410-728] 


94 


[1-331] 


[332-582] 


95 


[1-410] 


[411-1913] 


96 


[1-501] 


[502-670] 


97 


[1-141];[143-431] 


[142-142];[432-939] 


98 


[1-193] 


[194-661] 


99 


[1-629] 


[630-647]


100 


[1-520];[862-954];[976-1005]


[521-861];[955-975];[1006-1006] 


101 


[1-489];[581-961];[1010-1059] 


[490-580];[962-1009]


102 


[1-485] 


[486-514] 


103 


[1-540] 


[541-1158] 


104 


[1-556] 


[557-1563] 


105 


[1-868];[870-1006]


[869-869];[1007-1621] 


106 


[1-491] 


[492-557] 


107 


[1-573] 


[574-600] 


108 


[1-457];[586-1110]


[458-585];[1111-1129]


109 


[1-521];[655-778]


[522-654] 


110 


[1-416];[478-614];[616-990];[992-
1065];[1068-1283]


[417-477];[615-615];[991-991];[1066- 
1067];[1284-1301] 


111 


[1-416];[478-614];[628-989];[991-
1064];[1067-1282]


[417-477];[615-627];[990-990];[1065- 
1066];[1283-1300] 


112 


[2-429];[1161-1202];[1212-1388];[1392-1589] 


[1-1];[430-1160];[1203-1211];[1389- 
1391];[1590-1617] 


113 


[1-487] 


[488-1634] 


114 


[1-70];[86-496]


[71-85];[497-693] 


115 


[1-358];[360-558]


[359-359];[559-784] 


116 


[1-215];[218-495];[527-607]


[216-217];[496-526];[608-804]


117 


[1-466] 


[467-484] 


118 


[1-515];[906-963]


[516-905];[964-985] 


119 


[1-744];[746-816]


[745-745];[817-839] 


120 


[1-85];[87-521]


[86-86];[522-583]


121


[1-532] 


[533-1024] 


122 


[1-318];[325-517];[567-660]


[319-324];[518-566];[661-760]


123 


[1-498] 


[499-594] 


124 


[1-427] 


[428-559] 


125 


[1-642] 


[643-744] 


126 


[1-341];[350-696]


[342-349];[697-824]


127 


[1-482] 


[483-526] 


128 


[1-338] 


[339-618] 


129 


[1-191];[193-429];[450-678]


[192-192];[430-449];[679-776] 


130 


[19-463];[465-544]


[1-18];[464-464];[545-998]


131 


[1-470] 


[471-779] 


132 


[1-533] 


[534-1025] 


133 


[1-498] 


[499-607] 


134 


[1-168];[170-326];[328-471];[552-738]


[169-169];[327-327];[472-551];[739-774] 


135 


[1-346];[348-395];[440-473]


[347-347];[396-439];[474-611]


136 


[1-324];[343-436]


[325-342];[437-925] 


137 


[1-186];[188-251];[255-517] 


[187-187];[252-254];[518-674]


138 


[1-488] 


[489-1725] 


139 


[1-101];[103-190];[292-327];[1091- 
1161];[1228-1314] 


[102-102];[191-291];[328-1090];[1162- 
1227];[1315-1474] 


140 


[1-465];[516-653]


[466-515] 


141 


[1-761];[763-857];[912-1326] 


[762-762];[858-911];[1327-1490]


142 


[1-476] 


[477-661] 


143 


[1-531];[1471-1508];[1510-1547];[1587-1661] 


[532-1470];[1509-1509];[1548-1586];[1662-
1789]


144 


[1-492];[503-536]


[493-502];[537-2006] 


145 


[1-570] 


[571-1096] 


146 


[1-536];[621-703];[729-1075];[1198-1445]


[537-620];[704-728];[1076-1197];[1446-1666]


147 


[1-555];[578-628]


[556-577];[629-1687] 


148 


[1-444];[1201-1474];[1480-1516]


[445-1200];[1475-1479];[1517-1747] 


149 


[1-613];[626-658]


[614-625] 


150 


[4-199];[201-419];[421-492]


[1-3];[200-200];[420-420];[493-2045]


151 


[1-509] 


[510-788] 


152 


[1-483];[485-578]


[484-484];[579-1931]


153 


[1-497] 


[498-514] 


154 


[5-509];[579-763];[765-1162]


[1-4];[510-578];[764-764];[1163-1183]


155 


[1-486];[1095-1500]


[487-1094];[1501-1545] 


156 


[1-488];[740-797];[799-884];[895-974]


[489-739];[798-798];[885-894];[975-1068] 


157 


[1-161];[163-565];[567-701]


[162-162];[566-566];[702-1097] 


158 


[1-496];[692-754]


[497-691];[755-894]


159 


[1-483] 


[484-703]


160 


[1-494] 


[495-849] 


161 


[1-491] 


[492-846] 


162 


[1-505];[575-759];[761-1164]


[506-574];[760-760];[1165-1176]


163 


[1-699] 


[700-1084] 


164 


[38-483];[485-556] 


[1-37];[484-484];[557-1793]


165


[1-426];[1303-1444];[1717-1755];[1787-1825]


[427-1302];[1445-1716];[1756-1786];[1826-
1849]


166


[2-264];[266-446];[448-519]


[1-1];[265-265];[447-447];[520-1748]


167


[1-519];[523-552]


[520-522];[553-1275]


168


[1-457];[466-571]


[458-465];[572-1023]


169


[1-54];[57-501]


[55-56];[502-1085]


170


[1-541]


[542-776]


171


[1-489]


[490-1219]


172


[1-538];[977-1468]


[539-976];[1469-1487]


173


[1-631]


[632-1915]


174


[21-776];[888-967];[969-1061];[1063-1137];[1819-1967]


[1-20];[777-887];[968-968];[1062-1062];[1138-1818];[1968-1990]


175


[1-508]


[509-1971]


176


[1-127];[129-538];[979-1470]


[128-128];[539-978];[1471-1613]


177


[1-535];[973-1173];[1177-1330];[1332-1361]


[536-972];[1174-1176];[1331-1331]


178


[1-599];[626-830];[1082-1113]


[600-625];[831-1081]


179


[1-623];[1377-1406]


[624-1376];[1407-1960]


180


[1-414];[418-464]


[415-417];[465-1443]


181


[1-522];[533-587]


[523-532];[588-605]


182


[1-78];[99-131];[136-327];[1153-1184];[1210-1274];[1284-1319];[1385-1416]


[79-98];[132-135];[328-1152];[1185-1209];[1275-1283];[1320-1384];[1417-1724]


183


[1-512];[617-805];[871-952];[1387-1422];[1621-1661]


[513-616];[806-870];[953-1386];[1423-1620];[1662-1686]


184 


[1-453] 


[454-4o3J j 


185 


[1-773] 


None 


186 


[1-413];[423-604];[606-739]


[414-4zzJ,[oUj-oU0J,[ /4U-/jjJ 


187 


[1-117];[119-401]


T1 1Q 1 1 Ol.f/IA') 7C/11 

[1 lo-l 1 oJ,[4UZ-/^4J 


188 


[1-511];[684-870];[872-928];[935-951]


rcn /^cji-rc*?! c7n-ro7Q oi/ii-roco OOQ1 
[j iz-ooJj,[o / i-o / 1 j,[yzy-yj4j,[yoz-yyoj 


189 


[1-605] 


None 


190 


[2-475] 


r 1 1 1« T/1*7A ^7^1 
[1-1 J,[^/OOZOJ 


191 


[1-910] 


None 


192 


[1-101];[103-668]


[102-102]


1 A "5 

193 


[ 1-j2UJ;15o3-o3 /] 


[OZ 1-DoZJ 


194 


[1-7U6J 


None 


195 


ri t /ici.n ca /ic i i.r/i/C^ *c*7Al 
[ l - 1 45 J ; [ 1 50-45 l J , [4oo-o / UJ 


[ l'*0-l*4yj,['+DZ-HOJj 


lyo 


[ I -5UVJ 


[0 1U-3 1 vj 


iy / 


ri <aai 
[ 1 -DUUJ 


None 


1 OQ 

1 Vo 


ri ^mi.r<n< ^csi 

[ 1 OUjJ,[DUD-jB->J 


r s 04-S041 ♦ rs 86-^671 

[ J U4 _ JuH J, [ J OO-DO / J 


1 QO 


r i /tool 

[ i -4yoj 


MOO -S141 1 


! inn 
ZUU 


[ l-**OZJ 




OA 1 

2U 1 


[lOMJ 


None 


OAO 

ZUZ 


Ll-4oZJ,[4o40 


[HO J-HO J J 


701 


N -4081 


None 


204 


[1-519];[521-649]


[520-520];[650-665] 


205 


[1-261];[263-415];[417-640];[642-782]


[262-262];[416-416];[641-641];[783-1008] 


206 


[1-455] 


None 


207 


[1-402];[410-526]


[403-409];[527-749] 


208 


[1-520] 


[521-594]


209


[1-197];[200-472]


[198-199];[473-2098] 


210 


[1-311];[314-427]


[312-313];[428-428] 


211 


[1-689];[735-769]


[690-734] 


212 


[1-517] 


[518-914] 


213 


[2-576];[756-795];[1390-1441]


[1-1];[577-755];[796-1389];[1442-1489]


214 


[1-482]


[483-776] 


215 


[1-498]


[499-1412] 


216


[1-505];[1000-1293];[1295-1408];[1744-
1773]


[506-999];[1294-1294];[1409-1743] 


217 


[1-102];[104-291];[293-467];[486-708];[723-
831];[833-900];[910-1031];[1054-
1090];[1097-1153]


[103-103];[292-292];[468-485];[709-722];[832-
832];[901-909];[1032-1053];[1091-1096];[1154-
1251]


218 


[1-452] 


[453-894]


219 


[1-554];[556-598]


[555-555];[599-910] 


220 


[1-38];[41-95];[98-386];[388-487]


[39-40];[96-97];[387-387];[488-519] 


221 


[1-34];[38-220];[222-335];[337-518]


[35-37];[221-221];[336-336];[519-632]


222 


[1-468] 


[469-652] 


223 


[1-466] 


[467-650] 


224 


[1-466] 


[467-502] 


225 


[1-489];[653-1008]


[490-652];[1009-1739] 


226 


[1-657] 


None 


227 


[1-480] 


[481-888] 


228 


[1-501] 


[502-716] 


229 


[1-612] 


[613-654]


230 


[1-477];[485-538]


[478-484];[539-635] 


231 


[1-476];[484-537]


[477-483];[538-634] 


232 


[1-367];[371-512]


[368-370];[513-583]


233 


[1-305];[307-442];[460-503];[553-646]


[306-306];[443-459];[504-552];[647-753] 


234 


[1-260];[262-345];[347-454];[473-515];[565-
658]


[261-261];[346-346];[455-472];[516-564];[659-
762]


235 


[1-427] 


[428-537]


236


[1-465]


[466-994]


237 


[1-471];[496-526];[557-587];[597-637]


[472-495];[527-556];[588-596];[638-662] 


238 


[1-338];[352-497]


[339-351];[498-1829]


239 


[1-501] 


[502-1083] 


240 


[1-515];[1527-1583];[1585-1687];[1692- 
1831] 


[516-1526];[1584-1584];[1688-1691] 


241 


[1-515];[1526-1582];[1584-1686];[1691- 
1830] 


[516-1525];[1583-1583];[1687-1690]


Table VI



Seq Id No 


Designation of domain 


Database 


Positions of domains


242 


Cell attachment sequence 


PROSITE 


141-143 


242 


Peptidase family M20/M25/M40 


PFAM 


107-451 


244 


Mitochondrial energy transfer proteins 
signature 


PROSITE 


26-35 


244 


Mitochondrial energy transfer proteins 
signature 


PROSITE 


199-208 


244 


Mitochondrial carrier proteins 


PFAM 


5-84;87-
175;178-272


244 


Mitochondrial energy transfer proteins. 


BLOCKSPLUS 


12-36 


244 


Mitochondrial energy transfer proteins. 


BLOCKSPLUS 


13-36 


244 


Mitochondrial energy transfer proteins. 


BLOCKSPLUS 


131-144 


245 


Leucine zipper pattern 


PROSITE 


371-392 


249 


Leucine zipper pattern 


PROSITE 


20-41 


251 


Mitochondrial energy transfer proteins 
signature 


PROSITE 


26-35 


251


Mitochondrial carrier proteins


PFAM


5-72


251


Mitochondrial energy transfer proteins.


BLOCKSPLUS 


12-36 


251 


Mitochondrial energy transfer proteins. 


BLOCKSPLUS 


13-36 


254


Pancreatic ribonuclease family signature




6^-69 


254 


Pancreatic ribonucleases 


PFAM 


26-143 


254


PANCREATIC RIBONUCLEASE FAMILY
SIGNATURE


BLOCKSPLUS


AQ 


254 


Pancreatic ribonuclease family proteins. 


BLOCKSPLUS 


115-140 




SIGNATURE 


BLOCKSPLUS


09-1 10 




SIGNATURE 




1 14-1^ 

I 1 *T I 




Pancreatic ribonuclease family proteins.


BLOCKSPLUS


30-40 




PANCREATIC RIBONUCLEASE FAMILY
SIGNATURE 


BLOCKSPLUS 


114-137 


254 


PANCREATIC RIBONUCLEASE FAMILY 
SIGNATURE 


BLOCKSPLUS 


69-86 


255 


L-lactate dehydrogenase active site 


PROSITE 


239-245 


255 


lactate/malate dehydrogenase 


PFAM 


71-380 


255 


L-lactate dehydrogenase proteins.


BLOCKSPLUS 


186-224 


255 


L-LACTATE DEHYDROGENASE 
SIGNATURE 


BLOCKSPLUS 


96-121 


255 


L-lactate dehydrogenase proteins.


BLOCKSPLUS 


71-102 


255 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


238-256 


255 


L-LACTATE DEHYDROGENASE
SIGNATURE 


BLOCKSPLUS 


183-203 


255 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS 


288-323


255


L-LACTATE DEHYDROGENASE
SIGNATURE


BLOCKSPLUS


zU /-zz4 


255 


L-LACTATE DEHYDROGENASE
SIGNATURE


BLOCKSPLUS


/ 1 -yz 


255


L-lactate dehydrogenase proteins.


BLOCKSPLUS


1^8 1 A 7 
1 jo-lO / 


256 


lactate/malate dehydrogenase 


PFAM


71 1 7A 
/ 1 - 1 Z*+ 


256 


L-LACTATE DEHYDROGENASE 
SIGNATURE


BLOCKSPLUS 


96-121 


256 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS


71 1 m i 
/ 1 - 1 uz 


256 


L-LACTATE DEHYDROGENASE 
SIGNATURE


BLOCKSPLUS 


71-92 


256 


L-lactate dehydrogenase proteins. 


BLOCKSPLUS


71 1 nn 


256 


L-LACTATE DEHYDROGENASE
SIGNATURE


BLOCKSPLUS


7 1 Rzl 


257 


Leucine zipper pattern 




ice i n(L 


259 


HUKJV1A domain 


PFAM




1 

261 


Leucine zipper pattern 


PROSITE


149 1 


261 


Leucine zipper pattern 


PROSITE


1 70 101 


263 


Leucine zipper pattern 


PROSITE


1 ^ 7 A i 
1 j-jO 


264 


Ubiquitin family 


PFAM


1 87 
1 -oZ 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS


1 7 ^7 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS


7 1 /^R 
Z l-Oo 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS


7A /^fi 
ZO-Oo 


264 


Ubiquitin domain proteins. 


BLOCKSPLUS


1 7 Aft 
1 /-Oo 


266


u-PAR/Ly-6 domain


PFAM


110 


266 


Squash family of serine protease inhibit 


PFAM




267 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS


1 »^ 7n7 


271 


LBP / BPI / CETP family signature


PROSITE


7 8 An 


271 


Pyrokinins signature 


PROSITE


-17/1 770 


271 


LBP / BPI / CETP family


PFAM




271 


LBP / BPI / CETP family proteins.


BLOCKSPLUS


77 1 1 8 

/ z- 110 


271


LBP / BPI / CETP family proteins.


BLOCKSPLUS


7ftQ 7^^ 


271 


LBP / BPI / CETP family proteins.


BLOCKSPLUS 


28-58 


271 


LBP / BPI / CETP family proteins. 


BLOCKSPLUS


07C 7flO 


271 


LBP / BPI / CETP family proteins.


BLOCKSPLUS


7A 117 ' 


272 


Zinc finger, C3HC4 type (RING finger), 
signature 


PROSITE 


102-111 


272 


Zinc finger, C3HC4 type (RING finger) 


PFAM 


87-129 


272 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BLOCKSPLUS 


102-111 


273 


Zinc finger, C3HC4 type (RING finger), 
signature 


PROSITE 


30-39 


273 


Zinc finger, C3HC4 type (RING finger) 


PFAM 


15-57 


273 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BLOCKSPLUS 


30-39 


274 


RNA 3'-terminal phosphate cyclase signature


PROSITE 


157-167 


274 


RNA 3'-terminal phosphate cyclase


PFAM 


1-368


274


RNA 3'-terminal phosphate cyclase proteins.


BLOCKSPLUS 


12-44 


274 


RNA 3'-terminal phosphate cyclase proteins.


BLOCKSPLUS 


157-168 


275 


Ribosomal L27 protein 


PFAM 


31-86 


277 


Cell attachment sequence 


PROSITE 


292-294 


277 


DHHC zinc finger domain 


PFAM 


140-204 


279 


Endogenous opioids neuropeptides precursors 
signature 


PROSITE 


26-65 


279 


Vertebrate endogenous opioids neurope 


PFAM 


3-257 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


100-126 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


209-237 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


43-66 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


18-38 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


24-36 


279 


Endogenous opioids neuropeptides precursors 
proteins. 


BLOCKSPLUS 


105-125 


280 


Leucine zipper pattern 


PROSITE 


136-157 


280 


Leucine zipper pattern 


PROSITE 


272-293 


283 


Immunoglobulins and major histocompatibility 
complex proteins signature 


PROSITE 


380-386 


283 


Immunoglobulin domain 


PFAM 


205-285;318- 
384 


283 


Immunoglobulins and major histocompatibility 
complex proteins. 


BLOCKSPLUS 


319-336 


284 


Fucosyl transferase 


PFAM 


70-406 


285 


FAD/NAD-binding Cytochrome reductase 


PFAM 


27-149


285 


Oxidoreductase FAD/NAD-binding domain 


PFAM 


176-290


285 


Eukaryotic molybdopterin oxidoreductases 
proteins. 


BLOCKSPLUS 


58-86


285 


CYTOCHROME B5 REDUCTASE 
SIGNATURE 


BLOCKSPLUS 


75-86 


285 


CYTOCHROME B5 REDUCTASE 
SIGNATURE 


BLOCKSPLUS 


274-283 


285 


CYTOCHROME B5 REDUCTASE 
SIGNATURE 


BLOCKSPLUS 


141-156 


285 


Eukaryotic molybdopterin oxidoreductases 
proteins. 


BLOCKSPLUS 


274-286 


285 


Eukaryotic molybdopterin oxidoreductases 
proteins. 


BLOCKSPLUS 


60-85


285 


CYTOCHROME B5 REDUCTASE 
SIGNATURE 


BLOCKSPLUS 


181-198 


285 


FLAVOPROTEIN PYRIDINE NUCLEOTIDE 
CYTOCHROME REDUCTASE SIGNATURE 


BLOCKSPLUS 


181-197


286


Immunoglobulins and major histocompatibility
complex proteins signature


PROSITE


380-386 


286 


Immunoglobulin domain 


PFAM 


205-285;318- 
384 


286 


Immunoglobulins and major histocompatibility 
complex proteins. 


BLOCKSPLUS 


319-336 


287


Leucine zipper pattern 


PROSITE 


126-147 


288 


Leucine zipper pattern 


PROSITE 


20-41 


291 


Tissue inhibitors of metalloproteinases 
signature 


PROSITE 


24-36 


291


Tissue inhibitor of metalloproteinases 


PFAM 


22-199 


291


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


21-46 


291 


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


106-148 


291 


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


81-95 


291


Tissue inhibitors of metalloproteinases proteins. 


BLOCKSPLUS 


61-72 


294 


Domain of unknown function DUF59 


PFAM 


31-135 


296 


Immunoglobulin domain 


PFAM 


141-197 


297 


TonB-dependent receptor proteins signature 1 


PROSITE 


1-42 


298 


Fibroblast growth factor 


PFAM 


48-129 


299 


BolA-like protein 


PFAM 


39-114 


299 


PROTEIN BOLA TRANSCRIPTION 
REGULATION AC. 


BLOCKSPLUS 


68-98 


301 


Cell attachment sequence 


PROSITE 


172-174 


303 


Ribosomal L27 protein 


PFAM 


31-115 


304 


Leucine rich repeat C-terminal domain 


PFAM 


173-222 


304 


Leucine Rich Repeat 


PFAM 


92-115;116-
139;140-
163;164-185


309 


Leucine rich repeat C-terminal domain 


PFAM 


173-222


309


Leucine Rich Repeat


PFAM


92-115;116-
139;140-
163;164-185


311 


NOL1/NOP2/sun family


PFAM 


201-276;353- 
378 


311 


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


230-245 


311 


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


231-245 


312 


NOL1/NOP2/sun family


PFAM 


201-276 


312 


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


230-245 


312 


NOL1/NOP2/sun family proteins.


BLOCKSPLUS 


231-245 


314 


Leucine zipper pattern 


PROSITE 


8-29 


315 


Leucine zipper pattern 


PROSITE 


8-29 


341 


Immunoglobulin domain 


PFAM 


45-112 


349 


CDP-alcohol phosphatidyltransferases signature 


PROSITE 


54-76 


349 


Cytochrome b/b6 Qo site signature 


PROSITE 


97-102 


354 


SAM domain (Sterile alpha motif) 


PFAM 


82-147 


361 


Ribosomal Proteins L2 


PFAM


96-124 


368 


DAD family 


PFAM 


1-78 


370 


Ribosomal protein L34 


PFAM 


51-92 
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*1Q< 


Kelch motif


PFAM


114;116-
162;164-
209;211-
265;270-316


386 


SPRY domain 


PFAM 


85-205 


388 


PHD-finger. 


BLOCKSPLUS 


329-339 


389 


Eukaryotic thiol (cysteine) proteases histidine 
active site 


PROSITE 


268-278 


389 


Heat shock hsp70 proteins family signature 3 


PROSITE 


332-346 


389


Hsp70 protein 


PFAM 


3-509 


390 


Eukaryotic-type carbonic anhydrase 


PFAM 


20-59 


391 


PMP-22/EMP/MP20/Claudin family 


PFAM 


4-162 


392 


Sec1 family.


BLOCKSPLUS 


89-107 


393 


PMP-22/EMP/MP20/Claudin family 


PFAM 


4-182 


394 


Myc-type, 'helix-loop-helix' dimerization 
domain signature 


PROSITE 


13-28 


395


Glutathione S-transferases. 


PFAM 


47-122;260-309 


396


Transmembrane 4 family signature 


PROSITE 


112-134 


396


Transmembrane 4 family 


PFAM 


66-273 


396 


Transmembrane 4 family proteins. 


BLOCKSPLUS 


108-146 


396 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


BLOCKSPLUS 


129-151 


396 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


BLOCKSPLUS 


108-127 


396 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


BLOCKSPLUS 


247-274 


396 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


BLOCKSPLUS 


129-150 


396 


TRANSMEMBRANE FOUR FAMILY
SIGNATURE


BLOCKSPLUS 


128-154 


397 


ATP/GTP-binding site motif A (P-loop) 


PROSITE 


6-13 


397 


ADP-ribosylation factor family 


PFAM 


2-172 


398 


Isochorismatase family 


PFAM 


17-147 


399 


PAP2 superfamily 


PFAM 


19-175 


400 


Zinc carboxypeptidases, zinc-binding region 2 
signature 


PROSITE 


117-127


401


Zinc finger, C2H2 type, domain 


PROSITE 


36-57 


401 


Zinc finger, C2H2 type, domain 


PROSITE 


73-93 


401 


Zinc finger, C2H2 type, domain 


PROSITE 


114-134 


401 


Zinc finger, C2H2 type, domain 


PROSITE 


145-165 


401 


Zinc finger, C2H2 type 


PFAM 


34-57;71- 
93;112-134;143- 
165 


401 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


145-162 


401 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


114-131 


401 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


73-90 


402 


Zinc finger, C2H2 type, domain 


PROSITE 


113-133


402


Zinc finger, C2H2 type, domain


PROSITE


144-164 


402 


Regulator of chromosome condensation 
(RCC1) signature 2 


PROSITE


65-75 


402 


Zinc finger, C2H2 type 


PFAM 


111-133;142- 
164 


402 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


144-161 


402 


Zinc finger, C2H2 type, domain proteins. 


BLOCKSPLUS 


113-130 


403 


Glutathione S-transferases. 


PFAM 


47-122;260-309 


405 


PMP-22/EMP/MP20/Claudin family 


PFAM 


4-182 


406 


WD domain, G-beta repeat 


PFAM 


267-304;333- 
370 


408 


Rhomboid family 


PFAM 


186-323 


410 


Ank repeat 


PFAM 


47-79 


410 


REPEAT PROTEIN ANK NUCLEAR 
ANKYR. 


BLOCKSPLUS 


78-89 


410 


Ank repeat proteins. 


BLOCKSPLUS 


48-56 


412 


Serine proteases, subtilase family, aspartic acid 
proteins. 


BLOCKSPLUS 


165-178 


414 


Sir2 family 


PFAM 


84-268 


416


Kelch motif


PFAM


iU"DO ,\JO- 

114;116-
162;164-
209;211- 
265;270-316 


418 


Zinc-binding dehydrogenases 


PFAM 


16-313 


426 


Leucine zipper pattern 


PROSITE 


144-165 


447


Cytochrome c family heme-binding site 
signature 


PROSITE 


19-24 


447 


Immunoglobulins and major histocompatibility 
complex proteins signature 


PROSITE 


17-23 


453 


eEF-6 family 


PFAM


3-103 


454 


Cell attachment sequence 


PROSITE


226-228 


456 


Leucine zipper pattern 


PROSITE 


211-232


457 


Leucine zipper pattern 


PROSITE 


236-257 


466 


Zinc finger, C3HC4 type (RING finger), 
signature 


PROSITE 


56-65 


466 


SPRY domain 


PFAM 


375-500


466 


Zinc finger, C3HC4 type (RING finger) 


PFAM 


41-81 


466


B-box zinc finger. 


PFAM 


110-153


466 


Domain in SPla and the RYanodine Receptor. 


BLOCKSPLUS 


359-381 


466 


Domain in SPla and the RYanodine Receptor. 


BLOCKSPLUS 


443-457 


466 


Domain in SPla and the RYanodine Receptor. 


BLOCKSPLUS 


359-380


466 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BLOCKSPLUS 


56-65 


479


UBX domain


PFAM




481 


TBC domain 


PFAM 


65-171 


481 


Probable rabGAP domain proteins. 


BLOCKSPLUS 


153-159 


482 


TBC domain 


PFAM 


65-177


482


Probable rabGAP domain proteins.


BLOCKSPLUS


153-159
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i 244 
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250 


28..37;60..67;73..81 


251 


33..45;64..71 


252 


20..30;35..45;49..59;74..83


253 


3..9;59..65 
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22..33;35..52;53..67;70..77;80..100;106..117;142..147 
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255 


1 16.. 123; 14/.. 1 ->6;zU 1 ..zUo,zoz..z /o 


256 
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1 C "7 

257 
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260 
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261 
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44..50 


264 


51..58;82..90;153..164


266


15..20;38..49;76..81;95..105


267 


74..91;94..99;117..130;140..154;153..161;175..184;201..210;228..240;250..255


268 


36..42;43..54 


269 


41..46;64..73;80..100;106..122;160..172


270 


38..48;82..88


271 


34..40;72..79;111..123;146..153;251..259;307..314;316..322;372..377;436..444


272 


12..17;51..58;75..85;128..136


273 


4..13;56..64


274 


34..46;120..127;157..163;182..191;231..240;259..267;273..279;
291..299;344..355


275 


30..55;72..78 


276 


27..35;37..45;49..61;61..77;102..109;144..152;170..180;179..18 
8 


277 


61..67;147..152;154..166;284..299;308..313


278 


72..82;451..461;532..541 


279 


24..31;72..84;83..92;97..111;144..149;161..182;181..189;192..198;204..214;216..233;241..254;256..263
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1 280 
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8 16-117 199 ! 


470 
*r z» vy 


27..35;71..83;91..97;137..146


491 


9 11-77 89 

Z.. .11,/ / ..Ol 


49? 


90 96 61 69 


423 


9..19;25..34;47..60;57..65;87..92;106..116;126..134 


424 


A 1C.01 '2>4--s'*t A9-*7Q RA'ftQ Q«; in8 11"V19R 140 
O.. 1 0,2 1 .J4, ***0..02, /7..00,07..7J, IUO..I 1 J, liO.. 1H*7 


/IOC 


2..1 1 ,2y..4o,4 /..DD 


426 


j..20,JJ..O5 


427 


1 ii. 10 ac-ic ^q-co ^o-oo QS'inn ios-190 1^7-iss 160 


498 


9^ ^1-47 61-65 79-87 94 

1 ,*+ / .,U1,UJ.. /Z,CW . .j*"t 


490 


1 19-^1 10S0 69*82 87 


4^0 


81 91 

O 1 ..-7 1 


4^ 1 


^6 44-8^ 89 1 


i 4^9 


98 49-56 76 1 10 117 

Z.O. .HZ,,JU. . / V», 1 IV/-. 1 1 / 


All 


S 14-41 40 


4^4 


0 17-1S 91-61 70-80 89 


4^S 


1 94-^9 40-59 60 


4^6 


1 11-17 30*90 43 


4^7 


90 30*98 35*40 50 


438 


10..28;76..85;91..99;107..112 


4jy 


*2A 09- 1 Oil 1 1 O 


/I /I A 

44 U 


A0-A*7 *71«106 191 
jD..42,4 /..-> /,D3.. / 1 , 1 "JO.. J 2 1 j 


44 1 


1Q 9*\-97 44 
1 :-*..ZO,2 / -.HH 


442 


1 14*91 9Q16 49-44 S1-69 81-101 100*111 119*138 149 


443 


10 18*25 31 33 40 51 70*89 94 1 


444 


3 819 26*32 44 


445 


1..11;19..38;38..49;52..60;130..139


446 


12..20;28..37;43..66;90..102 


447 

HH / 


IS 90-94 ^1-^6 47*68 89 88 96 


1 448 


90 4S-8^ 01-88 04*139 144 1 


A AO 


99 ^VS4 64-86 06-109 108 


450 


27..39;47..60;101..107;155..164;270..281;287..300;306..312;32 
7..332 


451 


13..24;60..70;77..83 


452 


8..14;74..82 


453 


6..12;63..78;77..91;97..103;102..108 
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454 


32..44;66..72;101..114;166..174;209..235;243..252;258..263 


455


69..76;131..139;164..173


! 456 


54..63;95..103;187..202;21 1 ..216;249..261 


! 457 


14. .21 ;3 1 ..45;80..88; 1 87.. 1 94;347..353 


458 


47..62;79..86 


459 


1..8;27..37;90..97;99..106;123..140;145..163 


460


8..17;35..45;131..139;162..169;175..180 


461


1..6;13..23;58..66;89..101 


462


44..53;86..93 


463 


62..70 


464 


50..57;59..69;67..73;79..95


465 


10..17;23..44 


466 


3..15;71..78;110..121;125..131;259..269;296..306;312..318;340..346;353..363;370..379;407..412;417..425;448..453;483..493


467 


5..12;20..30;70..78;82..100;106..115;129..135


468 


8..16;22..31;36..45;75..84 


469 


14..23;98..105;106..116 


470


11..23;26..31;54..62;101..107


471 


23..29;66..81 


472 


23..29;93..100 


473 


8..25;79..89;103..109 


474 


37..45;80..89;94..101;125..130 


475 


37..45;80..89;94..101;125..130 


476


7..26;23..36;36..45;78..83;80..85 


477 


45..53 


478 


1..7;16..22;78..93;96..102 


479 


24..33;41..50;61..80;93..100;129..136;160..170;199..208;267..276;325..335


480 


5..14;43..51;102..116


481 


2..15;16..24;53..62;87..97;100..109;109..133;145..152 


482 


2..15;16..24;53..62;87..97;100..109;109..133;145..152;168..176 
Table VIII



Seq Id No 


Chromosomal location 


2 


16p11-p13


12 


16p11-p13 


22 


7q35-q36 


25 


chr.19 


34 


chr.17


35


6p21.3 


40 


chr.20 


42


12p13.3 


45 


12p13.3


51 


12p 


56 


22q11.2-q13.2 


57 


12p13 


60 


chr.10 


62 


chr.17


65 


Xq13


67 


chr.14 


70


chr.7(1);7q11.23-q21.1(1)


71 


chr.7(1);7q11.23-q21.1(1)


73 


6p21.3 


74 


6p21.3 


87 


19q13.1 


88 


7q21-q22 


94 


17q11.2 


99 


L 6t l 21 


101 


6p11.2-p21.3 


103 


chr.17 


106 


6q15-q16.3 


107 


16p13.3


108


12q


113 


1p33-p34.3


125 


6p22.1-p22.3 


126 


16p13.3 


127 


14q11.2 


135 


22q11.2-q13.2


138 


chr.3


141 


12q24.1


146 


3p21.3


147 


chr.2 


149 


chr.17 


150 


21q 


152 


21q 


154 


20q12-q13.11


155 


11p15.5 


160


19q13.2 


161


19q13.2 


162 


20q12-q13.11 


164 


21q 


166 


21q 



170


6p12.1-p21.1 


172


21q 


173


chr.19 


176 


21q 


177


21q 


179 


chr.6 


183 


chr.7 


185 


Xq21.3-q22.3 


186 


chr.20 


192 


11q12.2 


195 


chr.20 


196 


20q13.1-q13.2


197 


7p15-p21 


198 


19q13.3 


199 


chr.2 


201 


Xq22.1-q23 


202 


Xq22.1-q23 


204 


chr.20 


205 


chr.5 


206 


chr.2 


208


chr.5 


214 


chr.12 


220 


Xq28 


224 


chr.7 


227 


chr.14 


230 


chr.7 


231 


chr.7 


238 


19p13.3 
Table IX



PCT/IB00/01938 



Seq Id No


Tissue distribution 


1 


Br:28;FB:25;FK:9;Ov:17;Pl:12;Pr:4;SC:2;SG:4;Te:9


2 


Br:2;CP:1;FB:5;FK:1;Pl:3;Pr:10;SG:1


3 


Br:1;CP:1;FB:33;FK:13;Li:2;Ov:19;PG:12;Pl:27;Pr:15;SG:9;SI:12


4 


AG:1;CP:1;LG:1;Pr:3;Te:1


5 


Pa:4;Pr:2 


6 


Li:1;Pa:4;Pr:3


7 


Br:9;Pr:1;Te:3


8 


Br:4;FB:1;Pr:3;SG:8


9 


Br:4;Ce:1;Co:1;DM:4;FB:33;FK:16;He:3;Ki:6;LC:2;LG:4;Li:2;Lu:2;Ly:1;Ov:36;Pa:16;Pl:2;Pr:4;SC:2;SI:1;SN:1;Sp:1;UC:3;Ut:1


10 


Br:1;CP:1;Pr:4;SG:2


11 


Pr:2;SG:4 


12 


Br:1;CP:1;FB:5;FK:1;Pl:3;Pr:9;SG:1


13 


FL:4;Li:4 


14 


Li:4;Te:3 


15 


Te:1


16 


Li:3;Te:6


17 


Ce:1;FB:6;Li:1;Pl:5;Te:16


18 


Li:7;Te:6 


19 


Lr27Te"9 


20 


Li:1;Te:3


21 


Te:3 


22 


Te:3 


23 


Li:1;Te:6


24 


Li:2;Te:2 


25 


Te:8 


26 


Te:5 


27 


LC:1;Te:2


28 


Li:1;Te:2


29


AG:2;BM:1;Br:16;CP:1;Co:2;DM:1;FB:45;FK:62;FL:1;HP:3;LC:1;Li:2;Mu:1;Ov:2;Pr:10;SI:5;SN:3;Te:9;UC:1


30 


Li:2 


31 


Br:3;CP:1;FB:1;FK:6;Pr:1;Te:2


32 


Br:1;CP:1;Ce:6;Ov:1;Te:2


33 


FK:5;SC:1


34 


Br:1;FB:2;FK:48;Pl:2;SN:1


35 


Te:1


36 


FB:5;Pr:1;SN:1


37 


FB:3;FK:1;Li:1;SG:5


38 


FB:10 


39 


FB:3 


40 


Br:1;DM:1;FL:1;Pl:4;SG:13


41 


FB:3;FK:1;Li:1;SG:5
42


BM:1;SG:19 


43 


SG:1


44 


CP:1;FB:1;Mu:2;Pl:9;SG:7


45 


BM:1;SG:20 


46 


BM:1;DM:1;FB:5;FK:6;FL:1;He:1;Ki:2;Ov:9;Pl:1;SG:1;SI:1;Te:1


47 


Br:4;FB:4;Pr:3;SG:8 


48 


Br:12;Ce:1;Co:1;FB:5;FK:4;FL:5;HP:1;Ki:1;LC:1;Li:6;Ov:8;Pl:105;SC:1;SG:8;Te:

A 
H 


49 


Br:7;Ce:1;Co:1;FK:4;HP:1;Ki:1;LC:1;Li:5;Ov:8;Pl:5;SC:1;Te:1


50 


AG:1;CP:4;Ce:1;DM:2;FB:6;FK:4;FL:2;HP:2;LC:1;LG:3;Li:31;Lu:3;Mu:1;Ov:25;P


51 


FL:1 


52 


Br:2;CP:1;FB:3;FK:1;FL:5;LC:2;Pl:1;Pr:2;UC:2


53 


Br:3;FK:4;FL:4;HP:1;Li:3;Pl:11;SG:1;Te:1


54 


Br:15;Ce:1;FB:10;FK:10;FL:1;He:1;Ki:6;LC:1;Li:4;Ov:32;Pa:3;Pl:2;Pr:4;SC:1;SN:

Z,op.H,lc.o,UL.l,Ul.l 


55 


FL:2 


56 


Br:1;FB:1;FL:1;Te:1


57 


FL:4 


58 


FL:1;Li:1


59 


FB:3;FK:1;Li:1;SG:5


60 


Br:1;FB:1;FL:1;Pr:2


61 


Br:2;Pl:1


62 


Br:6;CP:1;Ce:7;FB:37;FK:4;FL:1;Pl:6;Pr:1;SG:3;SN:3;Te:1;UC:1


63 


Br:10


64 


Br:2;CP:2 


65 


Br:1;FB:11;LG:1;Th:1


66 


Br:30;Ce:1;Co:1;FB:60;FK:15;FL:3;HP:1;Ki:1;LC:1;Li:6;Ov:57;PG:9;Pl:145;Pr:21;


67 


Br:4;CP:1;FB:14;Ki:1;Li:1;Lu:2;Pr:1;Te:1


68 


Br:10


69


AG:1;Br:48;FB:3;FK:5;HP:1;He:1;Li:1;Pl:11;SC:2;SG:1;Te:2;Ut:1


70 


Br:11;DM:1;He:1


71 


DM:1;He:1


72 


Br:9;Pr:1;Te:2


73 


Br:8;Pr:1


74 


Br:5


76 


AG:1;Br:49;FB:4;FK:5;HP:1;He:1;LC:1;Li:1;Pl:11;SC:2;SG:1;Te:2;Ut:1


77 


Br:2;FK:2;HP:1;LC:1;Li:2;Ov:14;Pl:1;Pr:14;Te:5


78 


Br:9;Ce:1;DM:1;FB:21;FK:18;FL:1;HP:1;He:1;Ki:9;LC:2;LG:4;Li:2;Lu:2;Ov:34;Pl

•I'Pr'/t'Qr- 1 •QT-0«Q'M«9-Qr\' 1 -T Tt- 1 
. j,Jrr.H,ov^. 1 ,ol.Z,olN.Z,op. ljUl.l 


79 


Pr:1


80


Br:y;Cr:^;Co: l;L>M:o;rr>. l;rJs..o,ne.z,K.i.4,n^. i;laj. /,c>v.4U,ra. I ,rl.z,rr. I , 
SN:2;Sp:1;Te:12;UC:1;Ut:3


81 


FK:1;Te:1


82 


Li:l 1 
83


Br:2;CP:1;FB:10;FK:2;Ki:3;Li:7;Ov:10;SC:1;SN:1;Te:1;UC:1


84 


Br:5;FB:14;FK:9;Li:6;Ov:17;SG:8;Te:8


85 


Li:6;Te:2 


86 


Li:2;Te:2 


87 


Br:1;FB:35;FK:31;Li:20;Ov:37;PG:5;Pl:69;SI:5;Te:5


88 


Li:1;Pr:1;Te:7;Ut:2


89 


Te:1


90 


Te:2 


91 


FB:15;FK:3;Li:2;Ov:17;Pr:4;SG:7;Te:4


92


Te:2 


93 


Br:4;FB:1;SN:1;Te:2


94 


Te:1


95 


Li:2 


96 


AG:1;Br:1;FB:1


97 


FK:5;Te:2 


98 


Te:3 


99 


Br:3;FB:29;FK:1;Li:10;Ov:1;Pl:16;Pr:2;SG:1;Te:49


100 


Br:2;FB:3;FK:1;Ov:3;Te:1


101 


Br:10;FB:34;FK:1;Ov:1;Pl:85;Pr:1;Ut:1


102 


FB:6 


103 


FB:6;Li:3;Pl:1;Pr:1;SG:1;Te:7


104


Br:26;CP:1;FB:8;FK:131;Pl:20;Pr:20


105 


Br:3;CP:2;DM:2;FB:11;FK:3;LG:2;Ov:1;Pl:6;SC:2;SG:1;SN:4


106 


FB:4


107 


Br:3;FB:50;FK:59;FL:3;Pr:1


108


Br:1;FB:8;Li:1;Lu:1


109 


FB:11;Pr:21


110 


Br:14;Ce:1;FB:5;FK:5;FL:1;HP:1;He:2;Ki:3;LC:1;Li:4;Lu:1;Ov:7;Pl:2;Pr:1;SI:1;Sp:1;UC:1


111


Br:1;Ce:1;FK:1;HP:1;He:2;Ki:3;LC:1;Li:3;Lu:1;Ov:7;Sp:1;UC:1


112 


Br:1;HP:1;Lu:1;Pr:1;SG:9;Ut:1


113 


HP:1;SG:4 


114 


FK:9 


115 


AG:1;Br:3;CP:1;FB:14;FK:19;FL:1;HP:1;Pr:1;SG:1


116 


Br:5;CP:1;Ce:1;Co:1;DM:5;FK:3;FL:1;LC:3;LG:1;Lu:1;Ov:23;Pl:1;Te:8;UC:2;Ut:4


117 


Br:1;Ce:1;FB:1;FK:1;FL:2;Pl:3;SN:1;Te:1;UC:1


118 


CP:1;DM:1;FB:5;FK:2;FL:2;He:1;Lu:1;Ly:1;Ov:23;Pr:1;SN:2;Sp:2;Ut:1


119 


Li:2;Te:7 


120 


Br:6;Co:2;FB:1;FK:6;FL:1;Ov:3;Pl:32;Pr:1;SN:1


121 


AG:1;Br:4


122 


Br:5


123 


Br:1


124 


Br:2;Ki:2;Li:1;Ov:7;UC:1


125 


Br:2;FB:1;FL:6;He:1;Li:1;Ov:1;Pl:2;Pr:10;Te:1;Th:1


126 


Br:1
127


BM:1;Br:2;CP:2;FB:1;FK:3;HP:1;He:1;LG:1;Pl:1;Pr:1;SC:2;SG:2;Te:5;Ut:3


128 


Br:1


129 


Br:2;FB:6;Li:1;SG:3;Te:2


130 


Br:25;FB:3;FL:2 


131


Br:1


132 


Br:1


133


Br:1


134


Br:2;SN:1


135 


Br:1


136 


AG:1;Br:1;FL:1


137 


Br:1;Ce:1;FB:1;FK:1;FL:2;Pl:3;SN:1;Te:1;UC:1


138


Br:43 


139 


Br:11;CP:2;Co:1;DM:6;FB:1;FK:6;He:2;Ki:4;LC:1;LG:1;Ov:40;Pa:1;Pl:2;Pr:1;SN:2;Sp:1;Te:9;UC:1;Ut:3


140 


Br:23;Ce:1;DM:3;FB:38;FK:17;FL:2;HP:1;He:1;Ki:8;LC:3;LG:2;Li:6;Lu:1;Ly:1;Ov:40;Pr:4;SC:2;SN:4;Sp:1;Te:5;UC:1;Ut:1


141 


Br:39;FB:3;SN:2 


142 


Br:10;SN:2 


143 


Br:26;FK:2;HP:1;LC:1;Li:2;Ov:14;Pl:3;Pr:3;Te:5


144 


Br:14;Pr:2 


145 


FB:12;LG:1;Pr:4;Te:1;Ut:2


146 


Li:1;Ov:2;Pr:5;SG:11


147 


Li:1;Te:1


148 


Br:1;FB:1;Li:1;Te:1


149


Br:3;FB:5;FK:5;Li:1;Pl:8;Te:5


150


FK:6;Pr:2;SG:8 


151 


FK:9 


152 


FK:6;Pr:2;SG:9


153 


Te:1


154 


FB:28;Ov:4 


155 


Br:21;Ce:1;FB:32;FK:4


156 


Br:5;CP:1;FB:16;FK:3;He:1;Ki:5;Li:1;Ov:15;Pl:3;SG:2;SI:1;Sp:1;UC:1


157 


FB:14;FK:1;FL:1;SG:1 


158 


FB:7 


159 


FB:10 


160 


Ce:2;FB:12 


161 


Ce:2 


162 


FB:28;Ov:2 


163 


FB:14;FK:1;FL:1;SG:1 


164 


FK:4;Pr:1;SG:9


165 


Br:4;Co:1;Ki:1;Ov:2;Pr:4;SG:1


166 


FK:6;Pr:2;SG:9


167 


Br:1;FB:1;SG:5


168 


Br:1;FB:5;FK:7;SG:1;UC:1


169 


FK:2 


170


FL:12 
171


Br:2;FB:1;FK:1;Pl:7


172 


Br:106;FB:2;Pl:7 


173 


Br:14;FB:1;Pl:2;Te:1


174 


Br:17;He:1;Pl:1;SC:2;Te:1


175 


Br:14;Pr:2


176 


Br:106;FB:2;Pl:7


177 


Br:114;FB:7;FK:7;Ov:2;Pl:7;Pr:2;Te:9


178 


Br:16;CP:2;FB:2;FK:2;FL:1;Li:1;Pl:13;Pr:3;SC:1;Ut:1


179 


FL:1;HP:2;Pr:2;Te:1


180 


Pr:2 


181 


FB:1;Ov:2;Pr:1;UC:1


182 


BM:1;Br:4;DM:1;FB:6;FK:6;Ki:5;LC:2;LG:1;Li:1;Lu:1;Ov:15;Pl:1;Pr:2;SC:1;Sp:2;Te:2;Ut:1


183 


Br:8;CP:1;Co:2;DM:4;FB:1;FK:1;Ki:4;LC:1;Li:3;Ov:33;Pl:1;Pr:5;SC:2;SN:1;Sp:1;Te:5;UC:1;Ut:2


184


Pr:1


185 


FB:2;Li:1;Ov:1;SG:7;Te:5


186 


Te:3 


187 


Te:1


188 


Br:18;CP:1;DM:5;FB:40;FK:23;FL:2;He:3;Ki:10;LC:2;LG:1;Li:13;Lu:3;Ly:2;Mu:1;Ov:54;Pl:5;Pr:14;SC:2;SG:2;SI:2;SN:4;Sp:3;Te:4;UC:4


189 


Li:1;Te:1


190 


Br:7;CP:1;FB:1;FK:4;FL:5;He:1;Li:1;Ov:1;Pl:2;Pr:4;SG:1


191 


Li:2;Te:4 


192 


AG:1;Br:2;CP:1;FB:32;FK:1;Li:1;Ov:36;Pl:49;Pr:3;SC:1;SG:4;SN:4;Te:9;UC:1;Ut:2


193 


FB:31;FK:75;FL:7;Ov:12;Pl:23;Pr:8;SG:3;Te:16 


194 


Te:2 


195 


Te:7 


196 


Te:2 


197 


Te:3 


198 


Li:10;Te:43 


199 


Br:35;CP:3;FB:39;FK:56;FL:7;HP:1;LG:1;Li:1;Ly:1;Ov:2;Pl:10;Pr:8;SG:1;Te:4;Ut:2


200 


FB:17;FK:9;FL:5;Ov:21;Pl:41;Te:3 


201 


FK:16;SI:1 


202 


Br:1;Co:1;FB:111;FK:25;He:1;Li:4;Ov:3;Pr:6;Te:1


204 


Te:7 


205 


Li:7;Te:28 


206 


FB:28;Li:2;Ov:23;PG:11;Pl:45;SG:17;SI:11;Te:9


207


FB:16;FK:1;Ov:1;SC:1;Te:1


208


FB:5 


209 


FB:6 


210 


Br:1;FB:22


211 


Br:2;Ce:3;FB:6;FK:1


212 


Br:1;Co:2;FB:22;FK:2;LG:2;Mu:2;Pl:2;SG:4
213


Br:2;DM:1;FB:8;FK:8;FL:1;Ki:1;LG:3;Ov:5;Pa:1;Pl:4;Pr:1;SN:2;UC:1


214 


FB:7 


215


FB:4 


216 


Ov:3;SG:3 


217 


Br:4;CP:2;DM:1;FB:9;FK:3;Ki:2;LC:1;LG:1;Lu:3;Ly:1;Ov:14;Pl:1;Pr:1;SC:1;SG:2;Sp:1;Te:1;Ut:1


218 


FB:4;FK:2;Pl:1;Pr:11;SG:1


219 


Br:7;CP:3;FB:2;FL:1;HP:4;Lu:1;Ly:2;Mu:1;Ov:3;Pl:1;Pr:1;SN:2;Te:1


220 


Br:1;FL:1;Pl:2


221 


Co:1;FB:2;FL:1;Li:1;Pl:2


222 


FL:1;SG:2


223 


Li:1;Te:1


225 


Li:10


226 


Li:1;Te:4


227 


Li:1


228 


Br:1


229 


Br:3 


230 


Br:5;Ce:1;Co:1;DM:3;FB:1;FK:1;He:1;LC:1;LG:2;Ov:16;Pl:3;Pr:1;Te:2;Ut:1


231 


Br:3;Ce:1;Co:1;DM:3;FB:1;FK:1;He:1;LC:1;LG:2;Ov:16;Pl:3;Pr:1;Te:2;Ut:1


232 


AG:1;Br:17;CP:2;DM:1;FB:51;FK:9;FL:3;Li:3;Ov:3;Pl:2;Pr:10;SC:1;SG:5;Te:2;Ut:1


233 


Br:13


234 


Br:5


235 


Br:1;Pl:1


236 


Br:9


237 


Br:22;DM:2;FB:17;FK:9;Ki:4;LG:1;Li:1;Lu:2;Ov:24;Pr:3;SC:1;SI:1;SN:2;Te:2


238 


Br:17


239 


Br:11


240 


Br:28;Ce:1;DM:5;FB:52;FK:40;FL:2;HP:1;He:2;Ki:3;LC:1;LG:3;Li:1;Ly:1;Ov:28;Pl:1;Pr:5;SC:1;SI:1;SN:3;Sp:6;Te:1;UC:1;Ut:1


241


Br:4;Ce:1;DM:5;FB:5;FK:7;HP:1;He:2;Ki:3;LC:1;LG:3;Li:1;Ly:1;Ov:28;Pl:1;SC:1;SN:3;Sp:6;Te:1;UC:1;Ut:1
Table X



Seq Id No 


Low frequency 
expression 


High frequency 
expression 


1 


- 


Br,Ov


2 


-


Pr 


3 


Br,Te 


Ov,PG,Pl,SI 


4 




AG 


5 


_ 


Pa 


6 


_ 


Pa 


7 




Br 


8 


_ 


SG 


9 


Br,Te 


DM,He,Ki,Ov,Pa


10 


_ 


Pr 


11 


_ 


SG 


12 




Pr 


13 


_ 


FL,Li


14 




Li,Te


15 




Te 


16 




Li,Te 


17 




Te 


18 


— 


Li,Te


19 




Li,Te 


20 




Te 


21 




Te 


22 




Te 


23 




Te


24 


_ 


Li


25 


_ 


Te 


26


_ 


Te 


27 


_ 


LC,Te


28 


_ 


Te 


29 


Pl


FK 


30 




Li 


31 


_ 


FK 


32 




Ce 


33 


_ 


FK,SC 


34 


FB 


FK 


35 


_ 


Te 


36 


- 


SN 


37 




SG 


38 




FB 


40 




SG 


41 




SG 


42 




BM,SG


43 




SG 
44




Mu,Pl,SG 


45 


- 


BM,SG 


46 


- 


BM,Ki,Ov 


47 


- 


SG 


48 


FB,FK,Pr 


Pl


49 


- 


Ki,Ov 


50 


Br,FB,FK,SG 


Li,Ov,Te 


51 


- 


FL 


52 


- 


FL,LC,UC 


53 


- 


Pl


54 


- 


Ki,Ov,Pa,Sp 


55 


- 


FL 


57 


- 


FL 


58


- 


FL 


59 


- 


SG 


62 




Ce,FB 


63 


- 


Br 


64 


- 


CP 


65 


- 


FB,Th 


66 


FK,SG,Te 


Ov,PG,Pl 


67 


- 


FB,Ki,Lu 


68 


- 


Br


69 


FB 


Br 


70 


- 


Br,DM,He 


71 


- 


DM,He 


72 


- 


Br 


73 


- 


Br 


74 


- 


Br 


75 


- 


Br 


76 


FB 


Br 


77 


FB 


Ov,Pr 


78 


- 


Ki,Ov 


80 


FB 


DM,Ki,Ov 


82 


- 


Li 


83 


- 


Ki,Li,Ov 


84 


- 


Ov 


85 


- 


Li 


86 


- 


Li


87 


Br,Pr,SG 


Ov,PG,Pl


88


- 


Te,Ut


89 


- 


Te 


90 


- 


Te 


91 




Ov 


92 




Te 


93 




SN 


94 




Te 
95




Li


96 




AG 


97 




FK 


98 




Te 


99 


FK 


Te


100 




Ov 


101 


FK 


Pl


102 




FB 


103 




Te 


104 


FB,Li,SG,Te 


FK 


105 




DM,SN 


106 


-


FB 


107 


Br,Pl 


FB,FK 


108 




FB,Lu 


109 




Pr 


110 




He,Ki,Ov 


111 




Ce,He,Ki,Lu,Ov 


112 




Lu,SG 


113 




HP,SG 


114 




FK 


115 




FK 


116 


FB 


DM,LC,Ov,Ut 


117 




Ce,UC 


118 


-


Ov,Sp 


119 




Te


120 


FB 


Co,Pl 


121 


-


AG,Br 


122 




Br 


124 




Ki,Ov 


125 




FL,Pr,Th 


127 


- 


BM,SC,Ut 


130 


-


Br 


134 


- 


SN 


136 


-


AG 


137 


-


Ce,UC 


138 


FB 


Br 


139 


FB 


DM,Ki,Ov,Ut


140 


Pl


Ki,Ov 


141 




Br 


142 




Br,SN 


143 


FB 


Br,Ov 


144 




Br 


145 




FB,Ut


146 




SG 


149 




Pl


150




FK,SG
151




FK 


152 




FK,SG 


153 


-


Te 


154 




FB,Ov 


155 




Br,FB 


156 




Ki,Ov 


157 


-


FB 


158 


-


FB 


159 


- 


FB 


160 




Ce,FB 


161 


-


Ce 


162 


- 


FB 


163


- 


FB 


164 


- 


SG 


165 


- 


Co,Ki,Ov 


166 


- 


FK,SG 


167 




SG 


168 


- 


FK 


169 


- 


FK 


170 


- 


FL 


171


- 


Pl


172 


FB,FK,Pr 


Br 


173 


- 


Br 


174 


- 


Br,He,SC 


175 


- 


Br 


176 


FB,FK,Pr 


Br 


177 


FB 


Br 


178 


- 


Br,Pl 


179 


- 


HP 


180 


- 


Pr 


181 


- 


Ov,UC 


182 


- 


Ki,Ov,Sp


183 


FB 


DM,Ki,Ov 


185 


- 


SG,Te 


186 




Te 


187 


-


Te 


188 


Pl


DM,Ki,Ov 


190 




FL 


191 


- 


Te 


192 


Br,FK 


Ov,Pl 


193 


Br 


FK,Ov 


194 


- 


Te 


195 




Te 


196 




Te 


197 




Te 


198 


FB 


Li,Te 
199




FK


200 


Br 


Ov,Pl


201 




FK


202 




FK


203


Br,Pl




204 




Te


205 




Li,Te


206 


Br,FK,Pr 


Ov,PG,Pl,SG,SI 


207 




FB 


208 




FB


209 




FB 


210 




FB 


211 




Ce 


212 




Co,FB,Mu 


213 




Ki,LG,Ov 


214 




FB 


215 




FB 


216




Ov,SG 


217 




Ki,Lu,Ov 


218 




Pr 


219 




CP,HP,Ly,Ov,SN 


221 




Co 


222 




SG 


223 




SG 


225 


-


Li 


226 




Te


227 


— 


Li 


229 




Br 


230 




DM,Ov 


231 




DM,Ov 


232 




FB 


233 




Br 


234 




Br 


236 




Br 


237 




Ki,Lu,Ov


238




Br


239 




Br 


240 


Pl,Te 


DM,FK,Ki,Ov,Sp 


241 


FB 


DM,He,Ki,Ov,Sp 






WO 01/42451 



PCT/IB00/01938 



Table XI 



Seq Id No


Subcellular localization 


7 


nuclear 


13 


extracellular, including cell wall 


20 


mitochondrial 


21 


nuclear 


26 


nuclear 


35 


nuclear


37 


endoplasmic reticulum 


38 


extracellular, including cell wall 


39 


endoplasmic reticulum 


41 


endoplasmic reticulum 


59 


endoplasmic reticulum 


70 


nuclear 


71 


nuclear 


72 


nuclear 


78


nuclear 


98


nuclear 


99 


nuclear 


105 


mitochondrial 


108 


endoplasmic reticulum 


116 


mitochondrial 


117 


mitochondrial 


134 


nuclear 


135 


nuclear 


137 


mitochondrial 


159 


nuclear 


160 


nuclear 


161 


nuclear 


171 


nuclear 


178 


endoplasmic reticulum


182


nuclear 


184 


nuclear 


185 


endoplasmic reticulum 


186 


nuclear


187 


nuclear 


188 


nuclear 


194 


nuclear


195 


nuclear 


196 


nuclear 


200 


mitochondrial 


204 


nuclear 


205 


nuclear 


206 


nuclear 
211


nuclear 


212 


nuclear 


213 


nuclear 


214 


endoplasmic reticulum 


215


endoplasmic reticulum 


216 


endoplasmic reticulum 


218 


nuclear 


220 


endoplasmic reticulum 


224 


nuclear 


225


nuclear


230 


mitochondrial 


231 


mitochondrial


238 


cytoplasmic
Table XII



Seq Id No in
priority
applications


Internal designation


Seq Id No in
present
application


119 


119-003-4-0-C2-CS 


1 


220 


105-016-1-0-D3-CS 


2 


345 


105-016-3-0-G10-CS 


3 


334 


105-026-1-0-A5-CS


4 


159 


105-031-1-0-A11-CS


5 


219 


105-031-2-0-D3-CS


6 


250 


105-035-2-0-C6-CS 


7 


217 


105-037-2-0-H11-CS 


8 


340 


105-053-4-0-E8-CS 


9 


1 15 


105-074-3-0-H10-CS


10 


31 


105-089-3-0-G10-CS 


11


198 


105-095-2-0-G11-CS


12 


154 


106-006-1-0-E3-CS


13 


366 


106-037-1-0-E9-CS.cor


14 


366 


106-037-1-0-E9-CS.fr


15 


79 


106-043-4-0-H3-CS 


16 


95 


110-007-1-0-C7-CS


17


364 


114-016-1-0-H8-CS


18 


246 


116-004-3-0-A6-CS


19 


1 87 


116-054-3-0-E6-CS


20 


203 


116-055-1-0-A3-CS


21 


298 


116-055-2-0-F7-CS


22 


277 


116-088-4-0-A9-CS


23 


41


116-091-1-0-D9-CS


24 


353 


116-110-2-0-F4-CS


25 


78


116-111-1-0-H9-CS 


26 


245 


116-111-4-0-B3-CS 


27 


104 


116-115-2-0-F8-CS 


28 


259 


116-119-3-0-H5-CS 


29 


269 


117-001-5-0-G3-CS 


30 


166 


145-25-3-0-B4-CS.cor 


31


166 


145-25-3-0-B4-CS.fr 


32 


169 


145-56-3-0-D5-CS 


33 


312 


145-59-2-0-A7-CS 


34 


273 


157-15-4-0-B11-CS 


35 


190 


160-103-1-0-F11-CS


36 


244 


160-37-2-0-H7-CS


37


151 


160-58-3-0-H3-CS 


38 


149 


160-75-4-0-A9-CS 


39 


307 


174-10-2-0-F8-CS 


40 


264 


174-33-3-0-F6-CS 


41 
168


174-38-1-0-B6-CS


42 


202 


174-38-3-0-C9-CS 


43 


28 


174-39-2-0-A3-CS 


44 


331 


174-41-1-0-A6-CS


45 


258 


174-5-3-0-H7-CS 


46 


84 


174-7-4-0-H1-CS 


47


294 


175-1-3-0-E5-CS.cor


48 


294 


175-1-3-0-E5-CS.fr


49 


310 


180-19-4-0-F4-CS 


50 


311 


181-10-1-0-D10-CS


51 


263 


181-16-1-0-G7-CS 


52 


304 


181-16-2-0-A7-CS


53 


109 


181-20-3-0-B5-CS 


54 


121 


181-3-3-0-B8-CS 


55 


181 


181-3-3-0-C9-CS 


56 


191 


182-1-2-0-D12-CS 


57 


193 


184-1-4-0-C11-CS 


58 


192 


184^-1-0-All-CS 


59 


116 


187-12-4-0-A8-CS 


60 


268 


187-2-2-0-A3-CS 


61 


123 


187-31-0-0-f12-CS


62 


234 


1 87-34-0-0-1 12-CS 


63 


185 


187-37-0-0-c10-CS


64 


279 


1 87-38-0-0-11 0-CS 


65 


114 


187-39-0-0-k12-CS


66 


211 


187-41-0-0-i21-CS 


67 


236 


188-11-1-0-B3-CS


68 


35 


188-18-4-0-A9-CS 


69 


299 


188-28-4-0-B12-CS.cor 


70 


299 


188-28-4-0-B12-CS.fr 


71 


72 


188-28-4-0-D4-CS 


72 


242 


188-41-1-0-B8-CS.cor


73 


242 


188-41-1-0-B8-CS.fr


74 


173 


188-45-1-0-D9-CS


75 


106 


188-9-2-0-E1-CS 


76 


130 


105-079-3-0-A11-CS 


77 


323 


105-092-1-0-H7-CS


78


160 


105-141-4-0-H9-CS


79


272 


109-013-1-0-B9-CS


80


226 


110-008-4-0-D9-CS 


81 


333 


114-001-3-0-A2-CS 


82 


315 


114-028-2-0-C1-CS


83 


300 


114-032-1-0-H10-CS


84 


57 


114-043-2-0-A10-CS 


85 


137 


114-044-1-0-C5-CS 


86 


107 


116-003-3-0-D10-CS 


87 
164


116-003-3-0-G12-CS 


88 


108 


116-011-2-0-F11-CS


89 


101 


116-033-3-0-E4-CS 


90 


157 


116-041-4-0-B6-CS 


91 


75 


116-044-2-0-C4-CS


92


322


116-075-1-0-E6-CS


93 


124 


116-094-4-0-G5-CS 


94 


289 


117-005-3-0-F2-CS 


95 


122 


121-007-3-0-D9-CS 


96 


208 


145-91-3-0-D10-CS


97 


282


157-17-1-0-F4-CS


98 


129 


160-11-3-0-G8-CS


99 


317 


160-24-1-0-F12-CS


100


308 


160-24-2-0-E9-CS 


101 


25 


160-25-4-0-D2-CS 


102 


243 


160-31-3-0-A11-CS 


103 


346 


160-32-1-0-F6-CS


104


60 


160-37-1-0-A3-CS


105 


305 


160-40-3-0-E9-CS


106 


48 


160-58-3-0-E4-CS


107 


238 


160-85-3-0-D4-CS 


108 


251 


160-95-3-0-A11-CS 


109 


196 


162-10-4-0-F9-CS.cor


110 


196 


162-10-4-0-F9-CS.fr 


111 


347 


174-13-2-0-E4-CS 


112 


77 


174-46-2-0-B11-CS


113 


188 


179-8-2-0-A6-CS


114 


235 


180-22-3-0-B6-CS


115 


45 


181-13-1-0-F7-CS


116 


265 


181-15-4-0-F7-CS 


117 


280 


181-20-1-0-G7-CS


118 


281 


184-15-3-0-D1-CS 


119 


39 


187-12-2-0-G11-CS 


120 


165 


187-2-2-0-A12-CS 


121 


326 


187-30-0-0-k23-CS


122 


330 


187-36-0-0-e19-CS


123 


368 


187-38-0-0-d22-CS 


124 


71 


187-39-0-0-b9-CS 


125 


224 


187-39-0-0-g6-CS 


126 


90 


1 87-45-0-0-1 18-CS 


127 


216 


187-45-0-0-m21-CS 


128 


83 


187-45-0-0-n8-CS 


129


342 


187-46-0-0-f23-CS 


130 


262 


187-5-1-0-A12-CS


131


257 


187-5-1-0-F6-CS


132


293 


187-5-2-0-B2-CS


133 
231


187-5-3-0-D5-CS


134 


287 


187-51-0-0-f9-CS 


135 


325 


187-6-1-0-B9-CS


136 


309 


187-6-4-0-C10-CS 


137 


359 


188-19-2-0-C8-CS


138 


68 


188-22-4-0-G6-CS


139 


233 


188-28-4-0-D11-CS


140 


369 


188-29-1-0-E10-CS


141


155 


188-34-4-0-E5-CS


142 


327 


188-9-3-0-A5-CS


143 


283 


105-021-3-0-C3-CS


144 


29


105-037-4-0-H12-CS 


145 


100 


105-073-2-0-A7-CS


146 


99 


109-002-4-0-C6-CS


147 


360 


109-003-1-0-G4-CS


148 


321 


116-118-4-0-A8-CS


149 


120 


145-52-2-0-D12-CS 


150 


230 


145-7-2-0-G5-CS 


151 


177 


145-7-3-0-D3-CS 


152 


43 


157-17-2-0-C1-CS 


153 


352 


160-101-3-0-H2-CS 


154 


47 


160-12-1-0-D10-CS


155 


195 


160-28-4-0-C4-CS 


156 


344 


160-31-3-0-E4-CS


157 


61 


160-40-1-0-H4-CS


158 


237 


160-54-1-0-F7-CS


159 


32 


160-88-3-0-A8-CS.cor


160 


32 


160-88-3-0-A8-CS.fr


161 


97 


160-99-4-0-E4-CS 


162 


249 


161-5-4-0-B6-CS 


163 


218 


174-17-1-0-D6-CS


164 


266 


174-32-4-0-F8-CS 


165


161 


174-38-4-0-D11-CS


166 


113 


174-8-2-0-C10-CS 


167 


255 


179-14-2-0-F11-CS


168 


24 


179-9-4-0-B8-CS 


169


128


181-10-1-0-C9-CS


170


58 


187-5-3-0-C7-CS


171 


358 


188-26-4-0-F5-CS 


172 


171 


188-27-3-0-G1-CS 


173 


98 


188-29-2-0-H1-CS


174 


133 


188-31-1-0-E6-CS 


175 


49 


1 1 eo-45- l-0-13i-CS> 


176


42 


188-5-1-0-H6-CS


177 


148 


188-9-1-0-C10-CS


178 


319 


105-016-3-0-C5-CS


179 
WHAT IS CLAIMED IS:

1. An isolated polynucleotide, said polynucleotide comprising a nucleic acid sequence
encoding:

i) a polypeptide comprising an amino acid sequence having at least
about 80% identity to any one of the sequences shown as SEQ ID
NOs:242-482 or any one of the sequences of polypeptides encoded
by the clone inserts of the deposited clone pool; or
ii) a biologically active fragment of said polypeptide.

2. The polynucleotide of claim 1, wherein said polypeptide comprises any one of the
sequences shown as SEQ ID NOs:242-482 or any one of the sequences of the polypeptides encoded
by the clone inserts of the deposited clone pool.

3. The polynucleotide of claim 1, wherein said polypeptide comprises a signal peptide.

4. The polynucleotide of claim 1, wherein said polypeptide is a mature protein.

5. The polynucleotide of claim 1, wherein said nucleic acid sequence has at least about
80% identity over at least about 100 contiguous nucleotides to any one of the sequences shown as
SEQ ID NOs:1-241 or any one of the sequences of the clone inserts of the deposited clone pool.

6. The polynucleotide of claim 1, wherein said polynucleotide hybridizes under
stringent conditions to a polynucleotide comprising any one of the sequences shown as SEQ ID
NOs:1-241 or any one of the sequences of the clone inserts of the deposited clone pool.

7. The polynucleotide of claim 5, wherein said nucleic acid sequence comprises any
one of the sequences shown as SEQ ID NOs:1-241 or any one of the sequences of the clone inserts of
the deposited clone pool.

8. The polynucleotide of claim 1, wherein said polynucleotide is operably linked to a
promoter.

9. An expression vector comprising the polynucleotide of claim 8.

10. A host cell recombinant for the polynucleotide of claim 1.
11. A non-human transgenic animal comprising the host cell of claim 10.

12. A method of making a GENSET polypeptide, said method comprising

a) providing a population of host cells comprising the polynucleotide of
claim 8; and

b) culturing said population of host cells under conditions conducive to the
production of said polypeptide within said host cells.

13. The method of claim 12, further comprising purifying said polypeptide from said
population of host cells.

14. A method of making a GENSET polypeptide, said method comprising

a) providing a population of cells comprising the polynucleotide of claim
8;

b) culturing said population of cells under conditions conducive to the
production of said polypeptide within said cells; and

c) purifying said polypeptide from said population of cells.

15. An isolated polynucleotide, said polynucleotide comprising a nucleic acid sequence
having at least about 80% identity over at least about 100 contiguous nucleotides to any one of the
sequences shown as SEQ ID NOs:1-241 or any one of the sequences of the clone inserts of the
deposited clone pool.

16. The polynucleotide of claim 15, wherein said polynucleotide hybridizes under
stringent conditions to a polynucleotide comprising any one of the sequences shown as SEQ ID
NOs:1-241 or any one of the sequences of the clone inserts of the deposited clone pool.

17. The polynucleotide of claim 15, wherein said polynucleotide comprises any one of
the sequences shown as SEQ ID NOs:1-241 or any one of the sequences of the clone inserts of the
deposited clone pool.

18. A biologically active polypeptide encoded by the polynucleotide of claim 15. 

19. An isolated polypeptide or biologically active fragment thereof, said polypeptide
comprising an amino acid sequence having at least about 80% sequence identity to any one of the
sequences shown as SEQ ID NOs:242-482 or any one of the sequences of polypeptides encoded by
the clone inserts of the deposited clone pool.
20. The polypeptide of claim 19, wherein said polypeptide is selectively recognized by
an antibody raised against an antigenic polypeptide, or an antigenic fragment thereof, said antigenic
polypeptide comprising any one of the sequences shown as SEQ ID NOs:242-482 or any one of the
sequences of polypeptides encoded by the clone inserts of the deposited clone pool.

21. The polypeptide of claim 19, wherein said polypeptide comprises any one of the
sequences shown as SEQ ID NOs:242-482 or any one of the sequences of polypeptides encoded by
the clone inserts of the deposited clone pool.

22. The polypeptide of claim 19, wherein said polypeptide comprises a signal peptide. 

23. The polypeptide of claim 19, wherein said polypeptide is a mature protein. 

24. An antibody that specifically binds to the polypeptide of claim 19.

25. A method of determining whether a GENSET gene is expressed within a mammal,
said method comprising the steps of:

a) providing a biological sample from said mammal;
b) contacting said biological sample with either of:

i) a polynucleotide that hybridizes under stringent conditions to the
polynucleotide of claim 1; or

ii) a polypeptide that specifically binds to the polypeptide of claim 19; and
c) detecting the presence or absence of hybridization between said polynucleotide
and an RNA species within said sample, or the presence or absence of binding
of said polypeptide to a protein within said sample;
wherein a detection of said hybridization or of said binding indicates that said GENSET gene is
expressed within said mammal.

26. The method of claim 25, wherein said polynucleotide is a primer, and wherein said
hybridization is detected by detecting the presence of an amplification product comprising the
sequence of said primer.

27. The method of claim 25, wherein said polypeptide is an antibody.

28. A method of determining whether a mammal has an elevated or reduced level of
GENSET gene expression, said method comprising the steps of:

a) providing a biological sample from said mammal; and

b) comparing the amount of the polypeptide of claim 19, or of an RNA species
encoding said polypeptide, within said biological sample with a level
detected in or expected from a control sample;

wherein an increased amount of said polypeptide or said RNA species within said biological
sample compared to said level detected in or expected from said control sample indicates that said
mammal has an elevated level of said GENSET gene expression, and wherein a decreased amount
of said polypeptide or said RNA species within said biological sample compared to said level
detected in or expected from said control sample indicates that said mammal has a reduced level of
said GENSET gene expression.

29. A method of identifying a candidate modulator of a GENSET polypeptide, said
method comprising:

a) contacting the polypeptide of claim 18 with a test compound; and

b) determining whether said compound specifically binds to said
polypeptide;

wherein a detection that said compound specifically binds to said polypeptide indicates that
said compound is a candidate modulator of said GENSET polypeptide.

<400> 1

agatgtgaat agctccacta taccagcctc gtcttccttc cgggggacaa cgtgggtcag 60
ggcacagaga gatatttaat gtcaccctct tggggctttc atgggactcc ctctgccaca 120
ttttttggag gttgggaaag ttgctagagg cttcagaact ccagccta atg gat ccc 177
Met Asp Pro
-25

aaa ctc ggg aga atg gct gcg tcc ctg ctg gct gtg ctg ctg ctg ctg 225
Lys Leu Gly Arg Met Ala Ala Ser Leu Leu Ala Val Leu Leu Leu Leu
-20 -15 -10

ctg ctg gag cgc ggc atg ttc tcc tca ccc tcc ccg ccc ccg gcg ctg 273
Leu Leu Glu Arg Gly Met Phe Ser Ser Pro Ser Pro Pro Pro Ala Leu
-5 1 5

tta gag aaa gtc ttc cag tac att gac ctc cat cag gat gaa ttt gtg 321
Leu Glu Lys Val Phe Gln Tyr Ile Asp Leu His Gln Asp Glu Phe Val
10 15 20

cag acg ctg aag gag tgg gtg gcc atc gag agc gac tct gtc cag cct 369
Gln Thr Leu Lys Glu Trp Val Ala Ile Glu Ser Asp Ser Val Gln Pro
25 30 35 40

gtg cct cgc ttc aga caa gag ctc ttc aga atg atg gcc gtg gct gcg 417
Val Pro Arg Phe Arg Gln Glu Leu Phe Arg Met Met Ala Val Ala Ala
45 50 55

gac acg ctg cag cgc ctg ggg gcc cgt gtg gcc tcg gtg gac atg ggt 465
Asp Thr Leu Gln Arg Leu Gly Ala Arg Val Ala Ser Val Asp Met Gly
60 65 70

cct cag cag ctg ccc gat ggt cag agt ctt cca ata cct ccc gtc atc 513
Pro Gln Gln Leu Pro Asp Gly Gln Ser Leu Pro Ile Pro Pro Val Ile
75 80 85

ctg gcc gaa ctg ggg agc gat ccc acg aaa ggc acc gtg tgc ttc tac 561
Leu Ala Glu Leu Gly Ser Asp Pro Thr Lys Gly Thr Val Cys Phe Tyr
90 95 100

ggc cac ttg gac gtg cag cct gct gac cgg ggc gat ggg tgg ctc acg 609

Gly His Leu Asp Val Gln Pro Ala Asp Arg Gly Asp Gly Trp Leu Thr
105 110 115 120

gac ccc tat gtg ctg acg gag gta gac ggg aaa ctt tat gga cga gga 657
Asp Pro Tyr Val Leu Thr Glu Val Asp Gly Lys Leu Tyr Gly Arg Gly
125 130 135

gcg acc gac aac aaa ggc cct gtc ttg gct tgg atc aat gct gtg agc 705
Ala Thr Asp Asn Lys Gly Pro Val Leu Ala Trp Ile Asn Ala Val Ser
140 145 150

gcc ttc aga gcc ctg gag caa gat ctt cct gtg aat atc aaa ttc atc 753
Ala Phe Arg Ala Leu Glu Gln Asp Leu Pro Val Asn Ile Lys Phe Ile
155 160 165

att gag ggg atg gaa gag gct ggc tct gtt gcc ctg gag gaa ctt gtg 801
Ile Glu Gly Met Glu Glu Ala Gly Ser Val Ala Leu Glu Glu Leu Val
170 175 180

gaa aaa gaa aag gac cga ttc ttc tct ggt gtg gac tac att gta att 849
Glu Lys Glu Lys Asp Arg Phe Phe Ser Gly Val Asp Tyr Ile Val Ile
185 190 195 200

tca gat aac ctg tgg atc agc caa agg aag cca gca atc act tat gga 897
Ser Asp Asn Leu Trp Ile Ser Gln Arg Lys Pro Ala Ile Thr Tyr Gly
205 210 215

acc cgg ggg aac agc tac ttc atg gtg gag gtg aaa tgc aga gac cag 945
Thr Arg Gly Asn Ser Tyr Phe Met Val Glu Val Lys Cys Arg Asp Gln
220 225 230

gat ttt cac tca gga acc ttt ggt ggc atc ctt cat gaa cca atg gct 993
Asp Phe His Ser Gly Thr Phe Gly Gly Ile Leu His Glu Pro Met Ala
235 240 245

gat ctg gtt gct ctt ctc ggt agc ctg gta gac tcg tct ggt cat atc 1041
Asp Leu Val Ala Leu Leu Gly Ser Leu Val Asp Ser Ser Gly His Ile
250 255 260

ctg gtc cct gga atc tat gat gaa gtg gtt cct ctt aca gaa gag gaa 1089
Leu Val Pro Gly Ile Tyr Asp Glu Val Val Pro Leu Thr Glu Glu Glu
265 270 275 280

ata aat aca tac aaa gcc atc cat cta gac cta gaa gaa tac cgg aat 1137
Ile Asn Thr Tyr Lys Ala Ile His Leu Asp Leu Glu Glu Tyr Arg Asn
285 290 295

agc agc cgg gtt gag aaa ttt ctg ttc gat act aag gag gag att cta 1185
Ser Ser Arg Val Glu Lys Phe Leu Phe Asp Thr Lys Glu Glu Ile Leu
300 305 310

atg cac ctc tgg agg tac cca tct ctt tct att cat ggg atc gag ggc 1233
Met His Leu Trp Arg Tyr Pro Ser Leu Ser Ile His Gly Ile Glu Gly
315 320 325

gcg ttt gat gag cct gga act aaa aca gtc ata cct ggc cga gtt ata 1281
Ala Phe Asp Glu Pro Gly Thr Lys Thr Val Ile Pro Gly Arg Val Ile
330 335 340

gga aaa ttt tca atc cgt cta gtc cct cac atg aat gtg tct gcg gtg 1329
Gly Lys Phe Ser Ile Arg Leu Val Pro His Met Asn Val Ser Ala Val
345 350 355 360

gaa aaa cag gtg aca cga cat ctt gaa gat gtg ttc tcc aaa aga aat 1377
Glu Lys Gln Val Thr Arg His Leu Glu Asp Val Phe Ser Lys Arg Asn
365 370 375

agt tcc aac aag atg gtt gtt tcc atg act cta gga cta cac ccg tgg 1425
Ser Ser Asn Lys Met Val Val Ser Met Thr Leu Gly Leu His Pro Trp
380 385 390

att gca aat att gat gac acc cag tat ctc gca gca aaa aga gcg atc 1473
Ile Ala Asn Ile Asp Asp Thr Gln Tyr Leu Ala Ala Lys Arg Ala Ile
395 400 405

aga aca gtg ttt gga aca gaa cca gat atg atc cgg gat gga tcc acc 1521
Arg Thr Val Phe Gly Thr Glu Pro Asp Met Ile Arg Asp Gly Ser Thr
410 415 420

att cca att gcc aaa atg ttc cag gag atc gtc cac aag agc gtg gtg 1569
Ile Pro Ile Ala Lys Met Phe Gln Glu Ile Val His Lys Ser Val Val
425 430 435 440

cta att ccg ctg gga gct gtt gat gat gga gaa cat tcg cag aat gag 1617

Leu Ile Pro Leu Gly Ala Val Asp Asp Gly Glu His Ser Gln Asn Glu
445 450 455

aaa atc aac agg tgg aac tac ata gag gga acc aaa tta ttt gct gcc 1665
Lys Ile Asn Arg Trp Asn Tyr Ile Glu Gly Thr Lys Leu Phe Ala Ala
460 465 470

ttt ttc tta gag atg gcc cag ctc cat taatcacaag aaccttctag 1712
Phe Phe Leu Glu Met Ala Gln Leu His
475 480

tctgatctga tccactgaca gattcacctc ccccacatcc ctagacaggg atggaatgta 1772
aatatccaga gaatttgggt ctagtatagt acattttccc ttccatttaa aatgtcttgg 1832
gatatctgga tcagtaataa aatatttcaa aggcacagat gttggaaatg gtttaaggtc 1892
ccccactgca caccttcctc aagtcatagc tgcttgcagc aacttgattt ccccaagtcc 1952
tgtgcaatag ccccaggatt ggattccttc caacctttta gcatatctcc aaccttgcaa 2012
tttgattggc ataatcactc cagtttgctt tctaggtcct caagtgctcg tgacacataa 2072
tcattccatc caatgatcgc ctttgcttta ccactctttc cttttatctt attaataaaa 2132
atgttggtct ccaccactga aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaagaaaaaa 2192
aaaaaaaaa 2201



<210> 2
<211> 1631
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 148..1140

<220>
<221> sig_peptide
<222> 148..240
<223> Von Heijne matrix
score 10.0910253445132
seq LVLLLVTRSPVNA/CL

<400> 2

gtctgctgcc gccattgtgc ggcgctggtc ccctcagagg gttcctgctg ctgccggtgc 60
cttggaccct ccccctcgct tctcgttcta ctgccccagg agcccggcgg gtccgggact 120
cccgtccgtg ccggtgcggg cgccggc atg tgg ctg tgg gag gac cag ggc ggc 174
Met Trp Leu Trp Glu Asp Gln Gly Gly
-30 -25

ctc ctg ggc cct ttc tcc ttc ctg ctg cta gtg ctg ctg ctg gtg acg 222
Leu Leu Gly Pro Phe Ser Phe Leu Leu Leu Val Leu Leu Leu Val Thr
-20 -15 -10

cgg agc ccg gtc aat gcc tgc ctc ctc acc ggc agc ctc ttc gtt cta 270
Arg Ser Pro Val Asn Ala Cys Leu Leu Thr Gly Ser Leu Phe Val Leu
-5 1 5 10

ctg cgc gtc ttc agc ttt gag ccg gtg ccc tct tgc agg gcc ctg cag 318
Leu Arg Val Phe Ser Phe Glu Pro Val Pro Ser Cys Arg Ala Leu Gln
15 20 25

gtg ctc aag ccc cgg gac cgc att tct gcc atc gcc cac cgt ggc ggc 366
Val Leu Lys Pro Arg Asp Arg Ile Ser Ala Ile Ala His Arg Gly Gly
30 35 40

agc cac gac gcg ccc gag aac acg ctg gcg gcc att cgg cag gca gct 414
Ser His Asp Ala Pro Glu Asn Thr Leu Ala Ala Ile Arg Gln Ala Ala
45 50 55

aag aat gga gca aca ggc gtg gag ttg gac att gag ttt act tct gac 462
Lys Asn Gly Ala Thr Gly Val Glu Leu Asp Ile Glu Phe Thr Ser Asp
60 65 70

ggg att cct gtc tta atg cac gat aac aca gta gat agg acg act gat 510
Gly Ile Pro Val Leu Met His Asp Asn Thr Val Asp Arg Thr Thr Asp
75 80 85 90

ggg act ggg cga ttg tgt gat ttg aca ttt gaa caa att agg aag ctg 558
Gly Thr Gly Arg Leu Cys Asp Leu Thr Phe Glu Gln Ile Arg Lys Leu
95 100 105

aat cct gca gca aac cac aga ctc agg aat gat ttc cct gat gaa aag 606
Asn Pro Ala Ala Asn His Arg Leu Arg Asn Asp Phe Pro Asp Glu Lys
110 115 120

atc cct acc cta atg gaa gct gtt gca gag tgc cta aac cat aac ctc 654
Ile Pro Thr Leu Met Glu Ala Val Ala Glu Cys Leu Asn His Asn Leu
125 130 135

aca atc ttc ttt gat gtc aaa ggc cat gca cac aag gct act gag gct 702
Thr Ile Phe Phe Asp Val Lys Gly His Ala His Lys Ala Thr Glu Ala
140 145 150

cta aag aaa atg tat atg gaa ttt cct caa ctg tat aat aat agt gtg 750
Leu Lys Lys Met Tyr Met Glu Phe Pro Gln Leu Tyr Asn Asn Ser Val
155 160 165 170

gtc tgt tct ttc ttg cca gaa gtt atc tac aag atg aga caa aca gat 798
Val Cys Ser Phe Leu Pro Glu Val Ile Tyr Lys Met Arg Gln Thr Asp
175 180 185

cgg gat gta ata aca gca tta act cac aga cct tgg agc cta agc cat 846
Arg Asp Val Ile Thr Ala Leu Thr His Arg Pro Trp Ser Leu Ser His
190 195 200

aca gga gat ggg aaa cca cgc tat gat act ttc tgg aaa cat ttt ata 894
Thr Gly Asp Gly Lys Pro Arg Tyr Asp Thr Phe Trp Lys His Phe Ile
205 210 215

ttt gtt atg atg gac att ttg ctc gat tgg agc atg cat aat atc ttg 942
Phe Val Met Met Asp Ile Leu Leu Asp Trp Ser Met His Asn Ile Leu
220 225 230

tgg tac ctg tgt gga att tca gct ttc ctc atg caa aag gat ttt gta 990
Trp Tyr Leu Cys Gly Ile Ser Ala Phe Leu Met Gln Lys Asp Phe Val
235 240 245 250

tcc ccg gcc tac ttg aag aag tgg tca gct aaa gga atc cag gtt gtt 1038
Ser Pro Ala Tyr Leu Lys Lys Trp Ser Ala Lys Gly Ile Gln Val Val
255 260 265

ggt tgg act gtt aat acc ttt gat gaa aag agt tac tac gaa tcc cat 1086
Gly Trp Thr Val Asn Thr Phe Asp Glu Lys Ser Tyr Tyr Glu Ser His
270 275 280

ctt ggt tcc agc tat atc act gac agc atg gta gaa gac tgc gaa cct 1134
Leu Gly Ser Ser Tyr Ile Thr Asp Ser Met Val Glu Asp Cys Glu Pro
285 290 295

cac ttc tagactttca cggtgggacg aaacgggttc agaaactgcc aggggcctca 1190
His Phe
300

tacagggata tcaaaatacc ctttgtgcta gcccaggccc tggggaatca ggtgactcac 1250
acaaatgcaa tagttggtca ctgcattttt acctgaacca aagctaaacc cggtgttgcc 1310
accatgcacc atggcatgcc agagttcaac actgttgctc ttgaaaatct ggggtctgaa 1370
aaaacgcaca agagcccctg ccctgcccta gctgaggcac acagggagac ccagtgagga 1430
taagcacaga ttgaattgta caatttgcag atgcagatgt aaatgcatgg gacatgcatg 1490
ataactcaga gttgacattt taaaacttgc cacacttatt tcaaatattt gtactcagct 1550
atgttaacat gtactgtaga catcaaactt gtggccatac taataaaatt attaaaagga 1610
gcacaaaaaa aaaaaaaaaa a 1631
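The `<222>` locations in these entries are read here as 1-based, inclusive nucleotide ranges, so a coding feature should span a whole number of codons and the signal-peptide length fixes where residue numbering switches from negative to positive. The following is a minimal Python sketch of that arithmetic for SEQ ID NO: 2; the helper name `feature_length` is hypothetical and is not part of the listing.

```python
import re

def feature_length(location: str) -> int:
    """Length in nucleotides of a 'start..end' location (assumed 1-based, inclusive)."""
    match = re.fullmatch(r"\s*(\d+)\s*\.\.\s*(\d+)\s*", location)
    start, end = (int(x) for x in match.groups())
    return end - start + 1

cds_nt = feature_length("148..1140")  # <221> CDS of SEQ ID NO: 2
sig_nt = feature_length("148..240")   # <221> sig_peptide of SEQ ID NO: 2

cds_codons, cds_rem = divmod(cds_nt, 3)
sig_codons, sig_rem = divmod(sig_nt, 3)
mature_residues = cds_codons - sig_codons

print(cds_codons, sig_codons, mature_residues)  # prints "331 31 300"
```

Under these assumptions the signal peptide covers residues -31 to -1 and the mature protein residues 1 to 300, which agrees with the residue numbering printed under the sequence.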



<210> 3
<211> 1245
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 85..906

<220>
<221> sig_peptide
<222> 85..135
<223> Von Heijne matrix
score 3.86022363031904
seq GFVAALVAGGVAG/VS



<400> 3

aaaacatggc ggcgcccagc gcgcgaggac gtgatccgct tctgctccgg cttggattgt 60
agccttgacg aggtctgagc gacc atg gac cgg ccg ggg ttc gtg gca gcg 111
Met Asp Arg Pro Gly Phe Val Ala Ala
-15 -10

ctg gtg gct ggt ggg gta gca ggt gtt tct gtt gac ttg ata tta ttt 159
Leu Val Ala Gly Gly Val Ala Gly Val Ser Val Asp Leu Ile Leu Phe
-5 1 5

cct ctg gat acc att aaa acc agg ctg cag agt ccc caa gga ttt agt 207
Pro Leu Asp Thr Ile Lys Thr Arg Leu Gln Ser Pro Gln Gly Phe Ser
10 15 20

aag gct ggt ggt ttt cat gga ata tat gct ggc gtt cct tct gct gct 255
Lys Ala Gly Gly Phe His Gly Ile Tyr Ala Gly Val Pro Ser Ala Ala
25 30 35 40

att gga tcc ttt cct aat gct gct gca ttt ttt atc acc tat gaa tat 303
Ile Gly Ser Phe Pro Asn Ala Ala Ala Phe Phe Ile Thr Tyr Glu Tyr
45 50 55

gtg aag tgg ttt ttg cat gct gat tca tct tca tat ttg aca cct atg 351
Val Lys Trp Phe Leu His Ala Asp Ser Ser Ser Tyr Leu Thr Pro Met
60 65 70

aaa cat atg ttg gct gcc tct gct gga gaa gtg gtt gcc tgc ctg att 399
Lys His Met Leu Ala Ala Ser Ala Gly Glu Val Val Ala Cys Leu Ile
75 80 85

cga gtt cca tct gaa gtg gtt aag cag agg gca cag gta tct gct tct 447
Arg Val Pro Ser Glu Val Val Lys Gln Arg Ala Gln Val Ser Ala Ser
90 95 100

aca aga aca ttt cag att ttc tct aac atc tta tat gaa gag ggt atc 495
Thr Arg Thr Phe Gln Ile Phe Ser Asn Ile Leu Tyr Glu Glu Gly Ile
105 110 115 120

caa ggg ttg tat cga ggc tat aaa agc aca gtt tta aga gag att cct 543
Gln Gly Leu Tyr Arg Gly Tyr Lys Ser Thr Val Leu Arg Glu Ile Pro
125 130 135

ttt tct ttg gtc cag ttt ccc tta tgg gag tcc tta aaa gcc ctc tgg 591
Phe Ser Leu Val Gln Phe Pro Leu Trp Glu Ser Leu Lys Ala Leu Trp
140 145 150

tcc tgg agg cag gat cat gtg gtg gat tct tgg cag tca gca gtc tgt 639
Ser Trp Arg Gln Asp His Val Val Asp Ser Trp Gln Ser Ala Val Cys
155 160 165

gga gct ttt gca ggt gga ttt gcc gct gca gtc acc acc cct cta gac 687
Gly Ala Phe Ala Gly Gly Phe Ala Ala Ala Val Thr Thr Pro Leu Asp
170 175 180

gtg gca aag aca aga att atg ctg gca aag gct ggc tcc agc act gct 735
Val Ala Lys Thr Arg Ile Met Leu Ala Lys Ala Gly Ser Ser Thr Ala
185 190 195 200

gat ggg aat gtg ctc tct gtc ctg cat ggg gtc tgg cgg tca cag ggg 783
Asp Gly Asn Val Leu Ser Val Leu His Gly Val Trp Arg Ser Gln Gly
205 210 215

ctg gca gga tta ttt gca ggt gtc ttc cct cga atg gca gcc atc agt 831
Leu Ala Gly Leu Phe Ala Gly Val Phe Pro Arg Met Ala Ala Ile Ser
220 225 230

ctg gga ggt ttc atc ttt ctg ggg gct tat gac cga acg cac agc ttg 879
Leu Gly Gly Phe Ile Phe Leu Gly Ala Tyr Asp Arg Thr His Ser Leu
235 240 245

ctg ttg gaa gtt ggc aga aag agt cct tgaagcagag acaagcctca 926
Leu Leu Glu Val Gly Arg Lys Ser Pro
250 255

cctccacttc tgtcaagaga ggggcctgca gtgcaaaccc tcttccgctg agcagctgtc 986
tgaactatag gccccagtgc tgaagaccag ttgtgctaag ataccggcat ggagattgtg 1046
ccatccgtgg tataggctgg ctggtatgaa gtcattggcc tgtatgccag agagctaaga 1106
gaagaaaacg gggtctgtgg cggtactctg aacaatttcc tcagaacctc ttaataaata 1166
agtttggtaa tgctgagaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1226
agaaaaaaaa aaaaaaaaa 1245

<210> 4
<211> 1623
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 31..1248

<220>
<221> sig_peptide
<222> 31..135
<223> Von Heijne matrix
score 6.3770152988307
seq TLLLFAAPFGLLG/EK

<400> 4

aacctcttcc gtcggctgaa ttgcggccgt atg cgc ggc tct gtg gag tgc acc 54
Met Arg Gly Ser Val Glu Cys Thr
-35 -30

tgg ggt tgg ggg cac tgt gcc ccc agc ccc ctg ctc ctt tgg act cta 102
Trp Gly Trp Gly His Cys Ala Pro Ser Pro Leu Leu Leu Trp Thr Leu
-25 -20 -15

ctt ctg ttt gca gcc cca ttt ggc ctg ctg ggg gag aag acc cgc cag 150
Leu Leu Phe Ala Ala Pro Phe Gly Leu Leu Gly Glu Lys Thr Arg Gln
-10 -5 1 5

gtg tct ctg gag gtc atc cct aac tgg ctg ggc ccc ctg cag aac ctg 198
Val Ser Leu Glu Val Ile Pro Asn Trp Leu Gly Pro Leu Gln Asn Leu
10 15 20

ctt cat ata cgg gca gtg ggc acc aat tcc aca ctg cac tat gtg tgg 246
Leu His Ile Arg Ala Val Gly Thr Asn Ser Thr Leu His Tyr Val Trp
25 30 35

agc agc ctg ggg cct ctg gca gtg gta atg gtg gcc acc aac acc ccc 294
Ser Ser Leu Gly Pro Leu Ala Val Val Met Val Ala Thr Asn Thr Pro
40 45 50

cac agc acc ctg agc gtc aac tgg agc ctc ctg cta tcc cct gag ccc 342
His Ser Thr Leu Ser Val Asn Trp Ser Leu Leu Leu Ser Pro Glu Pro
55 60 65

gat ggg ggc ctg atg gtg ctc cct aag gac agc att cag ttt tct tct 390
Asp Gly Gly Leu Met Val Leu Pro Lys Asp Ser Ile Gln Phe Ser Ser
70 75 80 85

gcc ctt gtt ttt acc agg ctg ctt gag ttt gac agc acc aac gtg tcc 438
Ala Leu Val Phe Thr Arg Leu Leu Glu Phe Asp Ser Thr Asn Val Ser
90 95 100

gat acg gca gca aag cct ttg gga aga cca tat cct cca tac tcc ttg 486
Asp Thr Ala Ala Lys Pro Leu Gly Arg Pro Tyr Pro Pro Tyr Ser Leu
105 110 115

gcc gat ttc tct tgg aac aac atc act gat tca ttg gat cct gcc acc 534
Ala Asp Phe Ser Trp Asn Asn Ile Thr Asp Ser Leu Asp Pro Ala Thr
120 125 130

ctg agt gcc aca ttt caa ggc cac ccc atg aac gac cct acc agg act 582
Leu Ser Ala Thr Phe Gln Gly His Pro Met Asn Asp Pro Thr Arg Thr
135 140 145

ttt gcc aat ggc agc ctg gcc ttc agg gtc cag gcc ttt tcc agg tcc 630
Phe Ala Asn Gly Ser Leu Ala Phe Arg Val Gln Ala Phe Ser Arg Ser
150 155 160 165

agc cga cca gcc caa ccc cct cgc ctc ctg cac aca gca gac acc tgt 678
Ser Arg Pro Ala Gln Pro Pro Arg Leu Leu His Thr Ala Asp Thr Cys
170 175 180

cag cta gag gtg gcc ctg att gga gcc tct ccc cgg gga aac cgt tcc 726
Gln Leu Glu Val Ala Leu Ile Gly Ala Ser Pro Arg Gly Asn Arg Ser
185 190 195

ctg ttt ggg ctg gag gta gcc aca ttg ggc cag ggc cct gac tgc ccc 774
Leu Phe Gly Leu Glu Val Ala Thr Leu Gly Gln Gly Pro Asp Cys Pro
200 205 210

tca atg cag gag cag cac tcc atc gac gat gaa tat gca ccg gcc gtc 822
Ser Met Gln Glu Gln His Ser Ile Asp Asp Glu Tyr Ala Pro Ala Val
215 220 225

ttc cag ttg gac cag cta ctg tgg ggc tcc ctc cca tca ggc ttt gca 870
Phe Gln Leu Asp Gln Leu Leu Trp Gly Ser Leu Pro Ser Gly Phe Ala
230 235 240 245

cag tgg cga cca gtg gct tac tcc cag aag ccg ggg ggc cga gaa tca 918
Gln Trp Arg Pro Val Ala Tyr Ser Gln Lys Pro Gly Gly Arg Glu Ser
250 255 260

gcc ctg ccc tgc caa gct tcc cct ctt cat cct gcc tta gca tac tct 966
Ala Leu Pro Cys Gln Ala Ser Pro Leu His Pro Ala Leu Ala Tyr Ser
265 270 275

ctt ccc cag tca ccc att gtc cga gcc ttc ttt ggg tcc cag aat aac 1014
Leu Pro Gln Ser Pro Ile Val Arg Ala Phe Phe Gly Ser Gln Asn Asn
280 285 290

ttc tgt gcc ttc aat ctg acg ttc ggg gct tcc aca ggc cct ggc tat 1062
Phe Cys Ala Phe Asn Leu Thr Phe Gly Ala Ser Thr Gly Pro Gly Tyr
295 300 305

tgg gac caa cac tac ctc agc tgg tcg atg ctc ctg ggt gtg ggc ttc 1110
Trp Asp Gln His Tyr Leu Ser Trp Ser Met Leu Leu Gly Val Gly Phe
310 315 320 325

cct cca gtg gac ggc ttg tcc cca cta gtc ctg ggc atc atg gca gtg 1158
Pro Pro Val Asp Gly Leu Ser Pro Leu Val Leu Gly Ile Met Ala Val
330 335 340

gcc ctg ggt gcc cca ggg ctc atg ctg cta ggg ggc ggc ttg gtt ctg 1206
Ala Leu Gly Ala Pro Gly Leu Met Leu Leu Gly Gly Gly Leu Val Leu
345 350 355

ctg ctg cac cac aag aag tac tca gag tac cag tcc ata aat 1248
Leu Leu His His Lys Lys Tyr Ser Glu Tyr Gln Ser Ile Asn
360 365 370

taaggcccgc tctctggagg gaaggacatt actgaacctg tcttgctgtg cctcgaaact 1308
ctggaggttg gagcatcaag ttccagcccc cttcactccc ccatcttgct tttctgtgga 1368
acctcagagg ccagcctcga cttcctggag acccccaggt ggggcttcct tcatactttg 1428
ttgggggact ttggaggcgg gcaggggaca gggctattga taaggtcccc ttggtgttgc 1488
cttcttgcat ctccacacat ttcccttgga tgggacttgc aggcctaaat gagaggcatt 1548
ctgactggtt ggctgccctg gaaggcaaga aaatagattt attttttttt cacagggcaa 1608
aaaaaaaaaa aaaaa 1623

<210> 5
<211> 1454
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 72..143

<220>
<221> sig_peptide
<222> 72..119
<223> Von Heijne matrix
score 5.68931280801877
seq LGMLLGLLMAACT/PS

<400> 5

gtgtctgcca ctcggctgcc ggaggccgaa ggtccctgac tatggctccc cagagcctgc 60
cttcatctag g atg gct cct ctg ggc atg ctg ctt ggg ctg ctg atg gcc 110
Met Ala Pro Leu Gly Met Leu Leu Gly Leu Leu Met Ala
-15 -10 -5

gcc tgc aca cct tct gcc tca gtc atc aga acc tgaaggagtt tgccctgacc 163
Ala Cys Thr Pro Ser Ala Ser Val Ile Arg Thr
1 5

aacccagaga agagcagcac caaagaaacg gagagaaaag aaaccaaagc cgaggaggag 223
ctggatgccg aagtcctgga ggtgttccac ccgacgcatg agtggcaggc ccttcagcca 283
gggcaggctg tccctgcagg atcccacgta cggctgaatc ttcagactgg ggaaagagag 343
gcaaaactcc aatatgagga caagttccga aataatttga aaggcaaaag gctggatatc 403
aacaccaaca cctacacatc tcaggatctc aagagtgcac tggcaaaatt caaggagggg 463
gcagagatgg agagttcaaa ggaagacaag gcaaggcagg ctgaggtaaa gcggctcttc 523
cgccccattg aggaactgaa gaaagacttt gatgagctga atgttgtcat tgagactgac 583
atgcagatca tggtacggct gatcaacaag ttcaatagtt ccagctccag tttggaagag 643
aagattgctg cgctctttga tcttgaatat tatgtccatc agatggacaa tgcgcaggac 703
ctgctttcct ttggtggtct tcaagtggtg atcaatgggc tgaacagcac agagcccctc 763
gtgaaggagt atgctgcgtt tgtgctgggc gctgcctttt ccagcaaccc caaggtccag 823
gtggaggcca tcgaaggggg agccctgcag aagctgctgg tcatcctggc cacggagcag 883
ccgctcactg caaagggagg tgctcaccgt gcgcgtggtc acactgctct acgacctggt 943
cacggagaag atgttcgccg aggaggaggc tgagctgacc caggagatgt ccccagagaa 1003
gctgcagcag tatcgccagg tacacctcct gccakgcctg tgggaacagg gctggtgcga 1063
gatcacggcc cacctcctgg cgctgcccga gcatgatgcc ygtgagaagg tgctgcwgac 1123
actgggcgtc ctcctgacca cctgccggga ccgctaccgt caggaccccc agctcggcag 1183
gacactggcc agcctgcagg ctgagtacca ggtgctggcc agcctggagc tgcaggatgg 1243
tgaggacgag ggctacttcc aggagctgct gggctctgtc aacagcttgc tgaaggagct 1303
gagatgaggc cccacaccag gactggactg ggatgccgct agtgaggctg aggggtgcca 1363
gcgtgggtgg gcttctcagg caggaggaca tcttggcagt gctggcttgg ccattaaatg 1423
gaaacctgaa ggccaaaaaa aaaaaaaaaa a 1454
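The relationship between the `<221> CDS` and `<221> sig_peptide` locations and the residues printed under the sequence can be checked by translation. The following is a minimal Python sketch, assuming the standard genetic code and 1-based inclusive locations; `SEQ5_HEAD` is only the first 163 nucleotides of SEQ ID NO: 5 as transcribed here (with evident scanning errors such as "get" read as "gct"), and the helper names `translate` and `feature` are hypothetical.

```python
# Standard genetic code, built in the conventional t, c, a, g base order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(nt: str) -> str:
    """Translate a nucleotide string codon by codon (one-letter residues)."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))

def feature(seq: str, start: int, end: int) -> str:
    """Slice a 1-based, inclusive location out of a sequence."""
    return seq[start - 1:end]

# First 163 nt of SEQ ID NO: 5 (assumed transcription of the listing above).
SEQ5_HEAD = (
    "gtgtctgccactcggctgccggaggccgaaggtccctgactatggctccccagagcctgc"
    "cttcatctagg"
    "atggctcctctgggcatgctgcttgggctgctgatggcc"
    "gcctgcaca"
    "ccttctgcctcagtcatcagaacc"
    "tgaaggagtttgccctgacc"
)

signal = translate(feature(SEQ5_HEAD, 72, 119))   # sig_peptide 72..119
mature = translate(feature(SEQ5_HEAD, 120, 143))  # remainder of CDS 72..143
stop = translate(feature(SEQ5_HEAD, 144, 146))    # codon following the CDS

print(signal, mature, stop)
print(signal[-13:] + "/" + mature[:2])  # cleavage-site context, cf. the <223> 'seq' field
```

With this transcription the signal peptide has 16 residues (numbered -16 to -1), the mature peptide 8 residues (1 to 8), and the codon after position 143 is a stop codon.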



<210> 6
<211> 1639
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 111..1154

<220>
<221> sig_peptide
<222> 111..197
<223> Von Heijne matrix
score 4.68065944212013
seq LLGPLMAACFTFC/LS

<400> 6

agacggtcgc cgccgcgttt gcgcaggggg agctggtcgc cgccgcggcc gcctggaatt 60
gtgggagttg tgtctgccac tcggctgccg gaggccgaag gtccctgact atg gct 116
Met Ala

ccc cag agc ctg cct tca tct agg atg gct cct ctg ggc atg ctg ctt 164
Pro Gln Ser Leu Pro Ser Ser Arg Met Ala Pro Leu Gly Met Leu Leu
-25 -20 -15

ggg ccg ctg atg gcc gcc tgc ttc acc ttc tgc ctc agt cat cag aac 212
Gly Pro Leu Met Ala Ala Cys Phe Thr Phe Cys Leu Ser His Gln Asn
-10 -5 1 5

ctg aag gag ttt gcc ctg acc aac cca gag aag agc agc acc aaa gaa 260
Leu Lys Glu Phe Ala Leu Thr Asn Pro Glu Lys Ser Ser Thr Lys Glu
10 15 20

aca gag aga aaa gaa acc aaa gcc gag gag gag ctg gat gcc gaa gtc 308
Thr Glu Arg Lys Glu Thr Lys Ala Glu Glu Glu Leu Asp Ala Glu Val
25 30 35

ctg gag gtg ttc cac ccg acg cat gag tgg cag gcc ctt cag cca ggg 356
Leu Glu Val Phe His Pro Thr His Glu Trp Gln Ala Leu Gln Pro Gly
40 45 50

cag gct gtc cct gca gga tcc cac gta cgg ctg aat ctt cag act ggg 404
Gln Ala Val Pro Ala Gly Ser His Val Arg Leu Asn Leu Gln Thr Gly
55 60 65

gaa aga gag gca aaa ctc caa tat gag gac aag ttc cga aat aat ttg 452
Glu Arg Glu Ala Lys Leu Gln Tyr Glu Asp Lys Phe Arg Asn Asn Leu
70 75 80 85

aaa ggc aaa agg ctg gat atc aac acc aac acc tac aca tct cag gat 500
Lys Gly Lys Arg Leu Asp Ile Asn Thr Asn Thr Tyr Thr Ser Gln Asp
90 95 100

ctc aag agt gca ctg gca aaa ttc aag gag ggg gca gag atg gag agt 548
Leu Lys Ser Ala Leu Ala Lys Phe Lys Glu Gly Ala Glu Met Glu Ser
105 110 115

tca aag gaa gac aag gca agg cag gct gag gta aag cgg ctc ttc cgc 596
Ser Lys Glu Asp Lys Ala Arg Gln Ala Glu Val Lys Arg Leu Phe Arg
120 125 130

ccc att gag gaa ctg aag aaa gac ttt gat gag ctg aat gtt gtc att 644
Pro Ile Glu Glu Leu Lys Lys Asp Phe Asp Glu Leu Asn Val Val Ile
135 140 145

gag act gac atg cag atc atg gta cgg ctg atc aac aag ttc aat agt 692
Glu Thr Asp Met Gln Ile Met Val Arg Leu Ile Asn Lys Phe Asn Ser
150 155 160 165

tcc agc tcc agt ttg gaa gag aag att gct gcg ctc ttt gat ctt gaa 740
Ser Ser Ser Ser Leu Glu Glu Lys Ile Ala Ala Leu Phe Asp Leu Glu
170 175 180

tat tat gtc cat cag atg gac aat gcg cag gac ctg ctt tcc ttt ggt 788
Tyr Tyr Val His Gln Met Asp Asn Ala Gln Asp Leu Leu Ser Phe Gly
185 190 195

ggt ctt caa gtg gtg atc aat ggg ctg aac agc aca gag ccc ctc gtg 836
Gly Leu Gln Val Val Ile Asn Gly Leu Asn Ser Thr Glu Pro Leu Val
200 205 210

aag gag tat gct gcg ttt gtg ctg ggc gct gcc ttt tcc agc aac ccc 884
Lys Glu Tyr Ala Ala Phe Val Leu Gly Ala Ala Phe Ser Ser Asn Pro
215 220 225

aag gtc cag gtg gag gcc atc gaa ggg gga gcc ctg cag aag ctg ctg 932
Lys Val Gln Val Glu Ala Ile Glu Gly Gly Ala Leu Gln Lys Leu Leu
230 235 240 245

gtc atc ctg gcc acg gag cag ccg ctc act gca aag aag aag gtc ctg 980
Val Ile Leu Ala Thr Glu Gln Pro Leu Thr Ala Lys Lys Lys Val Leu
250 255 260

ttt gca ctg tgc tcc ctg ctg cgc cac ttc ccc tat gcc cag cgg cag 1028
Phe Ala Leu Cys Ser Leu Leu Arg His Phe Pro Tyr Ala Gln Arg Gln
265 270 275

ttc ctg aag ctc ggg ggg ctg cag gtc ctg agg acc ctg gtg cag gag 1076
Phe Leu Lys Leu Gly Gly Leu Gln Val Leu Arg Thr Leu Val Gln Glu
280 285 290

aag ggc acg gag gtg ctc gcc gtg cgc gtg gtc aca ctg ctc tac gac 1124
Lys Gly Thr Glu Val Leu Ala Val Arg Val Val Thr Leu Leu Tyr Asp
295 300 305

ctg gtc acg gag aag atg ttc gcc gag gag taggctgagc tgacccagga 1174
Leu Val Thr Glu Lys Met Phe Ala Glu Glu
310 315

gatgtcccca gagaagctgc agcagtatcg ccaggtacac ctcctgccag gcctgtggga 1234
acagggctgg tgcgagatca cggcccacct cctggcgctg cccgagcatg atgcccgtga 1294
gaaggtgctg cagacactgg gcgtcctcct gaccacctgc cgggaccgct accgtcagga 1354
cccccagctc ggcaggacac tggccagcct gcaggctgag taccaggtgc tggccagcct 1414
ggagctgcag gatggtgagg acgagggcta cttccaggag ctgctgggct ctgtcaacag 1474
cttgctgaag gagctgagat gaggccccac accaggactg gactgggatg ccgctagtga 1534
ggctgagggg tgccagcgtg ggtgggcttc tcaggcagga ggacatcttg gcagtgctgg 1594
cttggccatt aaatggaaac ctgaaggcaa aaaaaaaaaa aaaaa 1639

<210> 7
<211> 1768
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 66..1256

<220>
<221> sig_peptide
<222> 66..173
<223> Von Heijne matrix
score 4.89555877630516
seq LLLLRLNDAALRA/LQ

<400> 7

agaggaggtg gcggtggtgg ccctcgcctg tggcccccgt gctgcttgca ctcgaactcg 60
tcgcc atg gag gag ctc cag gag cct ctg aga gga cag ctc cgg ctc tgc 110
Met Glu Glu Leu Gln Glu Pro Leu Arg Gly Gln Leu Arg Leu Cys
-35 -30 -25

ttc acg caa gct gcc cgg act agc ctc tta ctg ctc agg ctc aac gac 158
Phe Thr Gln Ala Ala Arg Thr Ser Leu Leu Leu Leu Arg Leu Asn Asp
-20 -15 -10

gct gcc ctg cgg gcg ctg caa gag tgt cag cgg caa cag gta cgg ccg 206
Ala Ala Leu Arg Ala Leu Gln Glu Cys Gln Arg Gln Gln Val Arg Pro
-5 1 5 10

gtg att gct ttc caa ggc cac cga ggg tat ctg aga ctc cca ggc cct 254
Val Ile Ala Phe Gln Gly His Arg Gly Tyr Leu Arg Leu Pro Gly Pro
15 20 25

ggt tgg tcc tgc ctc ttc tcc ttc ata gtg tcc cag tgt tgt cag gag 302
Gly Trp Ser Cys Leu Phe Ser Phe Ile Val Ser Gln Cys Cys Gln Glu
30 35 40

ggc gct ggt ggt agc ttg gac ctt gtg tgc caa cgc ttc ctc agg tct 350
Gly Ala Gly Gly Ser Leu Asp Leu Val Cys Gln Arg Phe Leu Arg Ser
45 50 55

ggg cct aac agc ctc cac tgc ctg ggc tca ctc agg gag cgc ctc att 398
Gly Pro Asn Ser Leu His Cys Leu Gly Ser Leu Arg Glu Arg Leu Ile
60 65 70 75

att tgg gca gcc atg gat tct atc cca gcc cca tca tca gtt cag gga 446
Ile Trp Ala Ala Met Asp Ser Ile Pro Ala Pro Ser Ser Val Gln Gly
80 85 90

cac aac ctg act gaa gat gcc aga cat cct gag agt tgg cag aac aca 494
His Asn Leu Thr Glu Asp Ala Arg His Pro Glu Ser Trp Gln Asn Thr
95 100 105

gga ggc tat tct gaa gga gat gca gta tca cag cca cag atg gca cta 542
Gly Gly Tyr Ser Glu Gly Asp Ala Val Ser Gln Pro Gln Met Ala Leu
110 115 120

gag gag gtg tca gtg tca gat cca ctg gca agc aac caa gga cag tca 590
Glu Glu Val Ser Val Ser Asp Pro Leu Ala Ser Asn Gln Gly Gln Ser
125 130 135

ctc cca gga tcc tca agg gag cac atg gca cag tgg gaa gtg aga agc 638
Leu Pro Gly Ser Ser Arg Glu His Met Ala Gln Trp Glu Val Arg Ser
140 145 150 155

cag acc cat gtt cca aac aga gaa cct gtt cag gca ctg cct tcc tct 686
Gln Thr His Val Pro Asn Arg Glu Pro Val Gln Ala Leu Pro Ser Ser
160 165 170

gcc agc cgg aaa cgt ctg gac aag aaa cgt tca gtg cct gta gcc act 734
Ala Ser Arg Lys Arg Leu Asp Lys Lys Arg Ser Val Pro Val Ala Thr
175 180 185

gta gaa ctg gaa gaa aag agg ttc aga act ctg cct tta gtg cca agc 782
Val Glu Leu Glu Glu Lys Arg Phe Arg Thr Leu Pro Leu Val Pro Ser
190 195 200

ccc cta caa ggc ctg acc aat cag gat tta caa gag gga gaa gat tgg 830
Pro Leu Gln Gly Leu Thr Asn Gln Asp Leu Gln Glu Gly Glu Asp Trp
205 210 215

gag caa gaa gat gag gac atg gac ccc aga tta gaa cac agt tcc tca 878
Glu Gln Glu Asp Glu Asp Met Asp Pro Arg Leu Glu His Ser Ser Ser

220 225 230 235

gtt caa gaa gat tct gaa tcc cca agt cct gaa gat ata cca gac tac 926
Val Gln Glu Asp Ser Glu Ser Pro Ser Pro Glu Asp Ile Pro Asp Tyr
240 245 250

ctc ctg caa tac agg gcc atc cac agt gca gaa cag caa cat gcc tat 974
Leu Leu Gln Tyr Arg Ala Ile His Ser Ala Glu Gln Gln His Ala Tyr
255 260 265

gag cag gac ttt gag aca gat tat gct gaa tac cgc atc ctg cat gcc 1022
Glu Gln Asp Phe Glu Thr Asp Tyr Ala Glu Tyr Arg Ile Leu His Ala
270 275 280

cgt gtt ggg act gca agc caa agg ttc ata gag ctg gga gca gag att 1070
Arg Val Gly Thr Ala Ser Gln Arg Phe Ile Glu Leu Gly Ala Glu Ile
285 290 295

aaa aga gtt cgg cga gga act cca gaa tac aag gtc ctg gaa gac aag 1118
Lys Arg Val Arg Arg Gly Thr Pro Glu Tyr Lys Val Leu Glu Asp Lys
300 305 310 315

ata atc cag gaa tat aaa aag ttc agg aag cag tac cca agt tac aga 1166
Ile Ile Gln Glu Tyr Lys Lys Phe Arg Lys Gln Tyr Pro Ser Tyr Arg
320 325 330

gaa gaa aag cgt cgc tgt gag tac ctt cac cag aaa ttg tcc cac att 1214
Glu Glu Lys Arg Arg Cys Glu Tyr Leu His Gln Lys Leu Ser His Ile
335 340 345

aaa ggt ctc atc ctg gag ttt gag gaa aag aac agg ggc agc 1256
Lys Gly Leu Ile Leu Glu Phe Glu Glu Lys Asn Arg Gly Ser
350 355 360









tgaagttatc aagggaattt ttgagcctct gcttagtgaa acacaaagga acaaagcagc 1316
tataaactaa atagaatgca actatctgct tttcttatgc tgaccactgg agtccatggt 1376
aggttcttga ggtttggttt ggcaagtaga gagctgctct tcattattaa tttttagggt 1436
ccatagctgt gcctaggagt atgggcactg tgcaaagact ctaggaaaag tgacagaggc 1496
ttggcttttt tacctttagt tcagccaagt cattttcaag tcctgagaaa tgacatcatc 1556
ttcaggataa aataatgagg acattagaca aaccaaacta agtgaatttt agcctggtag 1616
cctctctaag gaaacagtaa taataacttc tgataagagt taaaagaact tgtagcatac 1676
ctggatataa tgggaaaggg cctgggtgtt acccatgtac tgaaaatgaa cttttaccaa 1736
catggctaaa aaattaaaaa aaaaaaaaaa aa 1768

<210> 8
<211> 1510
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 190..1398

<220>
<221> sig_peptide
<222> 190..252
<223> Von Heijne matrix
score 5.8172934575094
seq ALLWAQEVGQVLA/GR

<400> 8 

acggttgccc tggcagcgcg cgaggctggt gagtcggcag ccctgtggca gccggcgggc 60
tggtttccat ggttgcacga ttaggaacca ccagctgctg catcccatgg ccaggggtgg 120
cgtccaggtg gcagagcagc taggaacgca aggcctgaac ctggggccag acaccctgct 180
ctcccggcc atg gtc aac gac cct cca gta cct gcc tta ctg tgg gcc cag 231
Met Val Asn Asp Pro Pro Val Pro Ala Leu Leu Trp Ala Gln
-20 -15 -10

gag gtg ggc caa gtc ttg gca ggc cgt gcc cgc agg ctg ctg ctg cag 279
Glu Val Gly Gln Val Leu Ala Gly Arg Ala Arg Arg Leu Leu Leu Gln
-5 1 5

ttt ggg gtg ctc ttc tgc acc atc ctc ctt ttg ctc tgg gtg tct gtc 327
Phe Gly Val Leu Phe Cys Thr Ile Leu Leu Leu Leu Trp Val Ser Val
10 15 20 25

ttc ctc tat ggc tcc ttc tac tat tcc tat atg ccg aca gtc agc cac 375
Phe Leu Tyr Gly Ser Phe Tyr Tyr Ser Tyr Met Pro Thr Val Ser His
30 35 40

ctc agc cct gtg cat ttc tac tac agg acc gac tgt gat tcc tcc acc 423
Leu Ser Pro Val His Phe Tyr Tyr Arg Thr Asp Cys Asp Ser Ser Thr
45 50 55

acc tca ctc tgc tcc ttc cct gtt gcc aat gtc tcg ctg act aag ggt 471
Thr Ser Leu Cys Ser Phe Pro Val Ala Asn Val Ser Leu Thr Lys Gly
60 65 70

gga cgt gat cgg gtg ctg atg tat gga cag ccg tat cgt gtt acc tta 519
Gly Arg Asp Arg Val Leu Met Tyr Gly Gln Pro Tyr Arg Val Thr Leu
75 80 85

gag ctt gag ctg cca gag tcc cct gtg aat caa gat ttg ggc atg ttc 567
Glu Leu Glu Leu Pro Glu Ser Pro Val Asn Gln Asp Leu Gly Met Phe
90 95 100 105

ttg gtc acc att tcc tgc tac acc aga ggt ggc cga atc atc tcc act 615
Leu Val Thr Ile Ser Cys Tyr Thr Arg Gly Gly Arg Ile Ile Ser Thr
110 115 120

tct tcg cgt tcg gtg atg ctg cat tac cgc tca gac ctg ctc cag atg 663
Ser Ser Arg Ser Val Met Leu His Tyr Arg Ser Asp Leu Leu Gln Met
125 130 135

ctg gac aca ctg gtc ttc tct agc ctc ctg cta ttt ggc ttt gca gag 711
Leu Asp Thr Leu Val Phe Ser Ser Leu Leu Leu Phe Gly Phe Ala Glu
140 145 150

cag aag cag ctg ctg gag gtg gaa ctc tac gca gac tat aga gag aac 759
Gln Lys Gln Leu Leu Glu Val Glu Leu Tyr Ala Asp Tyr Arg Glu Asn
155 160 165

tcg gtg agt gag tac gtg ccg acc act gga gcg atc att gag atc cac 807
Ser Val Ser Glu Tyr Val Pro Thr Thr Gly Ala Ile Ile Glu Ile His
170 175 180 185

agc aag cgc atc cag ctg tat gga gcc tac ctc cgc atc cac gcg cac 855
Ser Lys Arg Ile Gln Leu Tyr Gly Ala Tyr Leu Arg Ile His Ala His
190 195 200

ttc act ggg ctc aga tac ctg cta tac aac ttc ccg atg acc tgc gcc 903
Phe Thr Gly Leu Arg Tyr Leu Leu Tyr Asn Phe Pro Met Thr Cys Ala
205 210 215

ttc ata ggt gtt gcc agc aac ttc acc ttc ctc agc gtc atc gtg ctc 951
Phe Ile Gly Val Ala Ser Asn Phe Thr Phe Leu Ser Val Ile Val Leu
220 225 230

ttc agc tac atg cag tgg gtg tgg ggg ggc atc tgg ccc cga cac cgc 999
Phe Ser Tyr Met Gln Trp Val Trp Gly Gly Ile Trp Pro Arg His Arg
235 240 245

ttc tct ttg cag gtt aac atc cga aaa aga gac aat tcc cgg aag gaa 1047
Phe Ser Leu Gln Val Asn Ile Arg Lys Arg Asp Asn Ser Arg Lys Glu
250 255 260 265

gtc caa cga agg atc tct gct cat cag cca ggt gca ggg cct gaa ggc 1095
Val Gln Arg Arg Ile Ser Ala His Gln Pro Gly Ala Gly Pro Glu Gly
270 275 280

cag gag gag tca act ccg caa tca gat gtt aca gag gat ggt gag agc 1143
Gln Glu Glu Ser Thr Pro Gln Ser Asp Val Thr Glu Asp Gly Glu Ser
285 290 295

cct gaa gat ccc tca ggg aca gag ggt cag ctg tcc gag gag gag aaa 1191
Pro Glu Asp Pro Ser Gly Thr Glu Gly Gln Leu Ser Glu Glu Glu Lys
300 305 310

cca gat cag cag ccc ctg agc gga gaa gag gag cta gag cct gag gcc 1239
Pro Asp Gln Gln Pro Leu Ser Gly Glu Glu Glu Leu Glu Pro Glu Ala
315 320 325

agt gat ggt tca ggc tcc tgg gaa gat gca gct ttg ctg acg gag gcc 1287
Ser Asp Gly Ser Gly Ser Trp Glu Asp Ala Ala Leu Leu Thr Glu Ala
330 335 340 345

aac ctg cct gct cct gct cct gct tct gct tct gcc cct gtc cta gag 1335
Asn Leu Pro Ala Pro Ala Pro Ala Ser Ala Ser Ala Pro Val Leu Glu
350 355 360

act ctg ggc agc tct gaa cct gct ggg ggt gct ctc cga cag cgc ccc 1383
Thr Leu Gly Ser Ser Glu Pro Ala Gly Gly Ala Leu Arg Gln Arg Pro
365 370 375

acc tgc tct agt tcc tgaagaaaag gggcagactc ctcacattcc agcactttcc 1438
Thr Cys Ser Ser Ser
380

cacctgactc ctctcccctc gtttttcctt caataaacta ttttgtgtca gctccaaaaa 1498 
aaaaaaaaaa aa 1510 

<210> 9 

<211> 882 

<212> DNA 

<213> Homo sapiens 



<220>
<221> CDS
<222> 78..410

<220>
<221> sig_peptide
<222> 78..155
<223> Von Heijne matrix
score 10.0731536331164
seq LWLALVSCILTQA/SA



<400> 9 

atggctggcc agaggaggaa cgctttgtgt tctcatcgga gctgcatggg aagtctgcat 60
acagcaaagt gacctgc atg cct cac ctt atg gaa agg atg gtg ggc tct 110
Met Pro His Leu Met Glu Arg Met Val Gly Ser
-25 -20

ggc ctc ctg tgg ctg gcc ttg gtc tcc tgc att ctg acc cag gca tct 158
Gly Leu Leu Trp Leu Ala Leu Val Ser Cys Ile Leu Thr Gln Ala Ser
-15 -10 -5 1

gca gtg cag cga ggt tat gga aac ccc att gaa gcc agt tcg tat ggg 206
Ala Val Gln Arg Gly Tyr Gly Asn Pro Ile Glu Ala Ser Ser Tyr Gly
5 10 15

ctg gac ctg gac tgc gga gct cct ggc acc cca gag gct cat gtc tgt 254
Leu Asp Leu Asp Cys Gly Ala Pro Gly Thr Pro Glu Ala His Val Cys
20 25 30

ttt gac ccc tgt cag aat tac acc ctc cta gat ttg ggg ccc atc act 302
Phe Asp Pro Cys Gln Asn Tyr Thr Leu Leu Asp Leu Gly Pro Ile Thr
35 40 45

cgg aga ggt gca cag tct ccc ggt gtc atg aat gga acc cct agc act 350
Arg Arg Gly Ala Gln Ser Pro Gly Val Met Asn Gly Thr Pro Ser Thr
50 55 60 65

gca ggg ttc ctg gtg gcc tgg cct atg gtc ctc ctg act gtc ctc ctg 398
Ala Gly Phe Leu Val Ala Trp Pro Met Val Leu Leu Thr Val Leu Leu
70 75 80

gct tgg ctg ttc tgagagctcc gctgagcatc tggccttgaa gtttgtgttc 450
Ala Trp Leu Phe
85

ttccctctgg caatggctcc cttcagcact tctgctttcc actccaattc acacaggctt 510
ggtattaaca gaatcaaggc caggctaggt taggaaaagg gaagagcttt caccttcttt 570
aaaactctcg gctgggcgca gtggctcatg cctgtaatcc cagcattttg ggaggctgag 630
gcaggtggat cacctgaggt cagcagttca aaatcagcct ggccaaaatg ctgaaactcc 690
gtctctacta aaaatacaaa aattagccag gcatggtgac aggcgcctgt aatcccagct 750
actcgggagg ccaaggcagg agaattgctc gaactcaggg ggtggaggtt gcagtgagtt 810
gagattgtgc cattgcactc cagcctgggc aacagagcaa gactctgtct caggcaaaaa 870
aaaaaaaaaa aa 882



<210> 10 
<211> 1849 






WO 01/42451 



PCT/IB00/01938



<212> DNA 

<213> Homo sapiens 

<220>
<221> CDS
<222> 84..299

<220>
<221> sig_peptide
<222> 84..134
<223> Von Heijne matrix
score 3.86022363031904
seq GFVAALVAGGVAG/VS

<400> 10
aaacatggcg gcgcccagcg cgcgaggacg tgatccgctt ctgctccggc ttggattgta 60
gccttgacga ggtctgagcg acc atg gac cgg ccg ggg ttc gtg gca gcg ctg 113
Met Asp Arg Pro Gly Phe Val Ala Ala Leu
-15 -10

gtg gct ggt ggg gta gca ggt gtt tct gtt gac ttg ata tta ttt cct 161
Val Ala Gly Gly Val Ala Gly Val Ser Val Asp Leu Ile Leu Phe Pro
-5 1 5

ctg gat acc att aaa acc agg ctg cag agt ccc caa gga ttt aat aag 209
Leu Asp Thr Ile Lys Thr Arg Leu Gln Ser Pro Gln Gly Phe Asn Lys
10 15 20 25

gct ggt ggt ttt cat gga ata tat gct ggc gtt cct tct gct gct att 257
Ala Gly Gly Phe His Gly Ile Tyr Ala Gly Val Pro Ser Ala Ala Ile
30 35 40

gga tcc ttt cct aat ggt tgc ctg cct gat tcg agt tcc atc 299
Gly Ser Phe Pro Asn Gly Cys Leu Pro Asp Ser Ser Ser Ile
45 50 55

tgaagtggtt aagcagaggg cacaggtatc tgcttctaca agaacatttc agattttctc 359
taacatctta tatgaagagg gtatccaagg gttgtatcga ggctataaaa gcacagtttt 419
aagagagatt cctttttctt tggtccagtt tcccttatgg gagtccttaa aagccctctg 479
gtcctggagg caggatcatg tggtggattc ttggcagtca gcagtctgtg gagcttttgc 539
aggtggattt gccgctgcag tcaccacccc tctagacgtg gcgaagacaa gaattatgct 599
ggcaaaggct ggctccagca ctgctgatgg gaatgtgctc tctgtcctgc atggggtctg 659
gcggtcacag gggctggcag gattatttgc aggtgtcttc cctcgaatgg cagccatcag 719
tctgggaggt ttcatctttc tgggggctta tgaccgaacg cacagcttgc tgttggaagt 779
tggcagaaag agtccttgaa gcagagacaa gcctcacctc cacttctgtc aagagagggg 839
cctgcagtgt aaaccctctt ccgctgagca gctgtctgaa ctataggccc cagtgctgaa 899
gaccagttgt gctaagatac cggcatggag attgtgccat ccgtggtata ggctggctgg 959
tatgaagtca ttggcctgta tgccagagag ctaagagaag aaaacggggt ctgtggcagt 1019
actctgaaca atttcctcag aacctcttaa taaataagtt tggtaatgct gaggccaggc 1079
cttttagagc tttcatttga tctgtatctg atctttcatt tcctgccacc tgatggtgga 1139
ttcagcagaa ggcaagatgg ttataattct aaaagaatag cttgtttgtt tgtttgtttg 1199
ggaaaaggag acttggggaa gagttgtgta tgtgggtgtt tctcccccta gttaattcct 1259
gttgtgtaag ggtaggcttt gttgaaaaag aaagaaagat tgaactacag gtgcatagca 1319
agcactcttt ctgggtaact aggctgctgg ttttaattac cctcagattt cacccataaa 1379
aacgcacaat tgtattattt tacagagatg tgtccagcgc cccctgtggt gtgtgagaga 1439
aagcagctgc aactcaagtg actaggtggg cccagctggc ttcgtgcagg agggcacggt 1499
gggtgagcca ttctcgccat tctcatgtca gactgaaagg agggcctggg ccagctttga 1559
aaaggcagga tgaaatggaa aggtcaccac acttagggat tttagacctt gactaacaag 1619
ctccaggtgt agaaaaattc aaaacaaaat gtcaggaatc tagcagtgtt gtctgccctg 1679
gagcaaacaa acagtatgtg attttgcttc gcctattttt tttttctttt ttgggggaag 1739
ataattaaag gcagaatgac tgcgtttgta aaagaaggac caccaactat actgacattt 1799
ataaatgaac ctttattaaa gacacttcaa tgcaaaaaaa aaaaaaaaaa 1849

<210> 11 
<211> 565
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 55..468

<220>
<221> sig_peptide
<222> 55..99
<223> Von Heijne matrix
score 8.96936032049195
seq FTTLLFLAAVAGA/LV

<400> 11
attccccaga ccttctgcag attctgtggt tatactcact cctcatccca aaga atg 57
Met
-15

aaa ttt acc act ctc ctc ttc ttg gca gct gta gca ggg gcc ctg gtc 105
Lys Phe Thr Thr Leu Leu Phe Leu Ala Ala Val Ala Gly Ala Leu Val
-10 -5 1

tat gct gaa gat gcc tcc tct gac tcg acg ggt gct gat cct gcc cag 153
Tyr Ala Glu Asp Ala Ser Ser Asp Ser Thr Gly Ala Asp Pro Ala Gln
5 10 15

gaa gct ggg acc tct aag cct aat gaa gag atc tca ggt cca gca gaa 201
Glu Ala Gly Thr Ser Lys Pro Asn Glu Glu Ile Ser Gly Pro Ala Glu
20 25 30

cca gct tca ccc cca gag aca acc aca aca gcc cag gag act tcg gcg 249
Pro Ala Ser Pro Pro Glu Thr Thr Thr Thr Ala Gln Glu Thr Ser Ala
35 40 45 50

gca gca gtt cag ggg aca gcc aag gtc acc tca agc agg cag gaa cta 297
Ala Ala Val Gln Gly Thr Ala Lys Val Thr Ser Ser Arg Gln Glu Leu
55 60 65

aac ccc ctg aaa tcc ata gtg gag aaa agt atc tta cta aca gaa caa 345
Asn Pro Leu Lys Ser Ile Val Glu Lys Ser Ile Leu Leu Thr Glu Gln
70 75 80

gcc ctt gca aaa gca gga aaa gga atg cac gga ggc gtg cca ggt gga 393
Ala Leu Ala Lys Ala Gly Lys Gly Met His Gly Gly Val Pro Gly Gly
85 90 95

aaa caa ttc atc gaa aat gga agt gaa ttt gca caa aaa tta ctg aag 441
Lys Gln Phe Ile Glu Asn Gly Ser Glu Phe Ala Gln Lys Leu Leu Lys
100 105 110

aaa ttc agt cta tta aaa cca tgg gca tgagaagctg aataatggga 488
Lys Phe Ser Leu Leu Lys Pro Trp Ala
115 120

tcattggact taaagcctta aatacccttg tagcccagag ctattaaaac gaaagcatcc 548
aaaaaaaaaa aaaaaaa 565

<210> 12 
<211> 1663 
<212> DNA 
<213> Homo sapiens 

<220>
<221> CDS
<222> 152..475

<220>
<221> sig_peptide
<222> 152..244
<223> Von Heijne matrix
score 10.0910253445132
seq LVLLLVTRSPVNA/CL

<400> 12
atgtgtctgc tgccgccatt gtgcggcgct ggtcccctca gagggttcct gctgctgccg 60
gtgccttgga ccctccccct cgcttctcgt tctactgccc caggagcccg gcgggtccgg 120
gactcccgtc cgtgccggtg cgggcgccgg c atg tgg ctg tgg gag gac cag 172
Met Trp Leu Trp Glu Asp Gln
-30 -25

ggc ggc ctc ctg ggc cct ttc tcc ttc ctg ctg cta gtg ctg ctg ctg 220
Gly Gly Leu Leu Gly Pro Phe Ser Phe Leu Leu Leu Val Leu Leu Leu
-20 -15 -10

gtg acg cgg agc ccg gtc aat gcc tgc ctc ctc acc ggc agc ctc ttc 268
Val Thr Arg Ser Pro Val Asn Ala Cys Leu Leu Thr Gly Ser Leu Phe
-5 1 5

gtt cta ctg cgc gtc ttc agc ttt gag ccg gtg ccc tct tgc agg gcc 316
Val Leu Leu Arg Val Phe Ser Phe Glu Pro Val Pro Ser Cys Arg Ala
10 15 20

ctg cag gtg ctc aag ccc cgg gac cgc att tct gcc atc gcc cac cgt 364
Leu Gln Val Leu Lys Pro Arg Asp Arg Ile Ser Ala Ile Ala His Arg
25 30 35 40

ggc ggc agc aam sag gcg ccc gag aac acg ctg gcg gcc att cgg cag 412
Gly Gly Ser Xaa Xaa Ala Pro Glu Asn Thr Leu Ala Ala Ile Arg Gln
45 50 55

cta aga atg gag caa cag gcg tgg agt tgg aca ttg agt tta ctt ctg 460
Leu Arg Met Glu Gln Gln Ala Trp Ser Trp Thr Leu Ser Leu Leu Leu
60 65 70

acg gga ttc ctg tct taatgcacga taacacagta gataggacga ctgatgggac 515
Thr Gly Phe Leu Ser
75

tgggcgattg tgtgatttga catttgaaca aattaggaag ctgaatcctg cagcaaacca 575
cagactcagg aatgatttcc ctgatgaaaa gatccctacc ctaagggaag ctgttgcaga 635
gtgcctaaac cataacctca caatcttctt tgatgtcaaa ggccatgcac acaaggctac 695
tgaggctcta aagaaaatgt atatggaatt tcctcaactg tataataata gtgtggtctg 755
ttctttcttg ccagaagtta tctacaaggt aacattcggg atttttcttg tacatattag 815
atgagacaaa cagatcggga tgtaataaca gcattaactc acagaccttg gagcctaagc 875
catacaggag atgggaaacc acgctatgat actttctgga aacattttat atttgttatg 935
atggacattt tgctcgattg gagcatgcat aatatcttgt ggtacctgtg tggaatttca 995
gctttcctca tgcaaaagga ttttgtatcc ccggcctact tgaagaagtg gtcagctaaa 1055
ggaatccagg ttgttggttg gactgttaat acctttgatg aaaagagtta ctacgaatcc 1115
catcttggtt ccagctatat cactgacagc atggtagaag actgcgaacc tcacttctag 1175
actttcacgg tgggacgaaa cgggttcaga aactgccagg ggcctcatac agggatatca 1235
aaataccctt tgtgctagcc caggccctgg ggaatcaggt gactcacaca aatgcaatag 1295
ttggtcactg catttttacc tgaaccaaag ctaaacccgg tgttgccacc atgcaccatg 1355
gcatgccaga gttcaacact gttgctcttg aaaatctggg tctgaaaaaa cgcacaagag 1415
cccctgccct gccctagctg aggcacacag ggagacccag tgaggataag cacagattga 1475
attgtacaat ttgcagatgc agatgtaaat gcatgggaca tgcatgataa ctcagagttg 1535
acattttaaa acttgccaca cttatttcaa atatttgtac tcagctatgt taacatgtac 1595
tgtagacatc aaacttgtgg ccatactaat aaaattatta aaaggagcac taaaaaaaaa 1655
aaaaaaaa 1663



<210> 13 
<211> 744 
<212> DNA 
<213> Homo sapiens

<220>
<221> CDS
<222> 112..552

<220>
<221> sig_peptide
<222> 112..183
<223> Von Heijne matrix
score 11.7298925418815
seq FVLGLGLTPPTLA/QD

<400> 13
tcacaactgg aacccatctc caggaacaaa cagctggaac ccatctcccg ttgaagggaa 60
actgccagat ttttgtaaga ttcttcctcc tgggagcctg tgttggaaga g atg gtg 117
Met Val

atg ggc ctg ggc gtt ttg ttg ttg gtc ttc gtg ctg ggt ctg ggt ctg 165
Met Gly Leu Gly Val Leu Leu Leu Val Phe Val Leu Gly Leu Gly Leu
-20 -15 -10

acc cca ccg acc ctg gct cag gat aac tcc agg tac aca cac ttc ctg 213
Thr Pro Pro Thr Leu Ala Gln Asp Asn Ser Arg Tyr Thr His Phe Leu
-5 1 5 10

acc cag cac tat gat gcc aaa cca cag ggc cgg gat gac aga tac tgt 261
Thr Gln His Tyr Asp Ala Lys Pro Gln Gly Arg Asp Asp Arg Tyr Cys
15 20 25

gaa agc atc atg agg aga cgg ggc ctg acc tca ccc tgc aaa gac atc 309
Glu Ser Ile Met Arg Arg Arg Gly Leu Thr Ser Pro Cys Lys Asp Ile
30 35 40

aac aca ttt att cat ggc aac aag cgc acg atc aag gcc atc tgt gaa 357
Asn Thr Phe Ile His Gly Asn Lys Arg Thr Ile Lys Ala Ile Cys Glu
45 50 55

aac aag aat gga aac cct cac aga gaa aac cta aga ata agc aag tct 405
Asn Lys Asn Gly Asn Pro His Arg Glu Asn Leu Arg Ile Ser Lys Ser
60 65 70

tct ttc cag gtc acc act tgc aag cta cat gga ggt tcc ccc tgg cct 453
Ser Phe Gln Val Thr Thr Cys Lys Leu His Gly Gly Ser Pro Trp Pro
75 80 85 90

cca tgc cag tac cga gcc aca gcg ggg ttc aga aac gtt gtt gtt gct 501
Pro Cys Gln Tyr Arg Ala Thr Ala Gly Phe Arg Asn Val Val Val Ala
95 100 105

tgt gaa aat ggc tta cct gtc cac ttg gat cag tca att ttc cgt cgt 549
Cys Glu Asn Gly Leu Pro Val His Leu Asp Gln Ser Ile Phe Arg Arg
110 115 120

ccg taaccagcgg gcccctggtc aagtgctggc tctgctgtcc ttgccttcca 602
Pro

tttcccctct gcacccagaa cagtggtggc aacattcatt gccaagggcc caaagaaaga 662
gctacctgga ccttttgttt tctgtttgac aacatgttta ataaataaaa atgtcttgat 722
atcagcaaaa aaaaaaaaaa aa 744

<210> 14 
<211> 1759 
<212> DNA 
<213> Homo sapiens

<220>
<221> CDS
<222> 101..1243

<220>
<221> sig_peptide
<222> 101..199
<223> Von Heijne matrix
score 3.57142340200611
seq FLCLGMALCPRQA/TR

<400> 14
gtagagtgct gaaggtcctg ccaacggctc tcttggcgtc tcaacgttcg gatcagcagc 60
ttttttccat tctctctctc cacttcttca gtgagcagcc atg agt tgg act gtg 115
Met Ser Trp Thr Val
-30

cct gtt gtg cgg gcc agc cag aga gtg agc tcg gtg gga gcg aat ttc 163
Pro Val Val Arg Ala Ser Gln Arg Val Ser Ser Val Gly Ala Asn Phe
-25 -20 -15

cta tgc ctg ggg atg gcc ctg tgt ccg cgt caa gca acg cgc atc ccg 211
Leu Cys Leu Gly Met Ala Leu Cys Pro Arg Gln Ala Thr Arg Ile Pro

ctc aac ggc acc tgg ctc ttc acc ccc gtg agc aag atg gcg act gtg 259
Leu Asn Gly Thr Trp Leu Phe Thr Pro Val Ser Lys Met Ala Thr Val
5 10 15 20

aag agt gag ctt att gag cgt ttc act tcc gag aag ccc gtt cat cac 307
Lys Ser Glu Leu Ile Glu Arg Phe Thr Ser Glu Lys Pro Val His His
25 30 35

agt aag gtc tcc atc ata gga act gga tcg gtg ggc atg gcc tgc gct 355
Ser Lys Val Ser Ile Ile Gly Thr Gly Ser Val Gly Met Ala Cys Ala
40 45 50

atc agc atc tta tta aaa ggc ttg agt gat gaa ctt gcc ctt gtg gat 403
Ile Ser Ile Leu Leu Lys Gly Leu Ser Asp Glu Leu Ala Leu Val Asp
55 60 65

ctt gat gaa gac aaa ctg aag ggt gag acg atg gat ctt caa cat ggc 451
Leu Asp Glu Asp Lys Leu Lys Gly Glu Thr Met Asp Leu Gln His Gly
70 75 80

agc cct ttc acg aaa atg cca aat att gtt tgt agc aaa gat tac ttt 499
Ser Pro Phe Thr Lys Met Pro Asn Ile Val Cys Ser Lys Asp Tyr Phe
85 90 95 100

gtc aca gca aac tcc aac cta gtg att atc aca gca ggt gca cgc caa 547
Val Thr Ala Asn Ser Asn Leu Val Ile Ile Thr Ala Gly Ala Arg Gln
105 110 115

gaa aag gga gaa acg cgc ctt aat tta gtc cag cga aat gtg gcc atc 595
Glu Lys Gly Glu Thr Arg Leu Asn Leu Val Gln Arg Asn Val Ala Ile
120 125 130

ttc aag tta atg att tcc agt att gtc cag tac agc ccc cac tgc aaa 643
Phe Lys Leu Met Ile Ser Ser Ile Val Gln Tyr Ser Pro His Cys Lys
135 140 145

ctg att att gtt tcc aat cca gtg gat atc tta act tat gta gct tgg 691
Leu Ile Ile Val Ser Asn Pro Val Asp Ile Leu Thr Tyr Val Ala Trp
150 155 160

aag ttg agt gca ttt ccc aaa aac cgt att att gga agc ggc tgt aat 739
Lys Leu Ser Ala Phe Pro Lys Asn Arg Ile Ile Gly Ser Gly Cys Asn
165 170 175 180

ctg gat act gct cgt ttt cgt ttc ttg att gga caa aag ctt ggt atc 787
Leu Asp Thr Ala Arg Phe Arg Phe Leu Ile Gly Gln Lys Leu Gly Ile
185 190 195

cat tct gaa agc tgc cat gga tgg atc ctc gga gag cat gga gac tca 835
His Ser Glu Ser Cys His Gly Trp Ile Leu Gly Glu His Gly Asp Ser
200 205 210

agt gtt cct gtg tgg agt gga gtg aac ata gct ggt gtc cct ttg aag 883
Ser Val Pro Val Trp Ser Gly Val Asn Ile Ala Gly Val Pro Leu Lys
215 220 225

gat ctg aac tct gat ata gga act gat aaa gat cct gag caa tgg aaa 931
Asp Leu Asn Ser Asp Ile Gly Thr Asp Lys Asp Pro Glu Gln Trp Lys
230 235 240

aat gtc cac aaa gaa gtg act gca act gcc tat gag att att aaa atg 979
Asn Val His Lys Glu Val Thr Ala Thr Ala Tyr Glu Ile Ile Lys Met
245 250 255 260

aaa ggt tat act tct tgg gcc att ggc cta tct gtg gcc gat tta aca 1027
Lys Gly Tyr Thr Ser Trp Ala Ile Gly Leu Ser Val Ala Asp Leu Thr
265 270 275

gaa agt att ttg aag aat ctt agg aga ata cat cca gtt tcc acc ata 1075
Glu Ser Ile Leu Lys Asn Leu Arg Arg Ile His Pro Val Ser Thr Ile
280 285 290

att aag ggc ctc tat gga ata gat gaa gaa gta ttc ctc agt att cct 1123
Ile Lys Gly Leu Tyr Gly Ile Asp Glu Glu Val Phe Leu Ser Ile Pro
295 300 305

tgt atc ctg gga gag aac ggt att acc aac ctt ata aag ata aag ctg 1171
Cys Ile Leu Gly Glu Asn Gly Ile Thr Asn Leu Ile Lys Ile Lys Leu
310 315 320

acc cct gaa gaa gag gcc cat ctg aaa aaa agt gca aaa aca ctc tgg 1219
Thr Pro Glu Glu Glu Ala His Leu Lys Lys Ser Ala Lys Thr Leu Trp
325 330 335 340
gaa att cag aat aag ctt aag ctt taaagttgcc taaaactacc attccgaaat 1273
Glu Ile Gln Asn Lys Leu Lys Leu
345

tattgaagag atcatagata caggattata taacgaaatt ttgaataaac ttgaattcct 1333
aaaagatgga aacaggaaag taggtagagt gattttccta tttatttagt cctccagctc 1393
ttttattgag catccacgtg ctggacgata cttatttaca attcctaagt atttttggta 1453
cctctgatgt agcagcactt gccatgttat atatatgtag ttggcatttg gttcccaaaa 1513
agtaggatgt aggtatttat tgtgttctag aaattccgac tcttttcatt agatatatgc 1573
tatttctttc attcttgctg gtttatacct atgttcattt atatgctgta aaaaagtagt 1633
agcttcttct acaatgtaaa aataaatgta catacaaaaa aatgcagtag tatatacaat 1693
cttttgtttt gcttcctttg atagttaata aattccgttt gttgaatcaa taaaaaaaaa 1753
aaaaaa 1759



<210> 15 
<211> 1755
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 101..517

<220>
<221> sig_peptide
<222> 101..199
<223> Von Heijne matrix
score 3.57613483592743
seq FLCLGMALCLRQA/TR

<400> 15
gtagagtgct gaaggtcctg ccaacggctc tcttggcgtc tcaacgttcg gatcagcagc 60
ttttttccat tctctctctc cacttcttca gtgagcagcc atg agt tgg act gtg 115
Met Ser Trp Thr Val
-30

cct gtt gtg cgg gcc agc cag aga atg agc tcg gtg gga gcg aat ttc 163
Pro Val Val Arg Ala Ser Gln Arg Met Ser Ser Val Gly Ala Asn Phe
-25 -20 -15

cta tgc ctg ggg atg gcc ctg tgt ctg cgt caa gca acg cgc atc ccg 211
Leu Cys Leu Gly Met Ala Leu Cys Leu Arg Gln Ala Thr Arg Ile Pro
-10 -5 1

ctc aac ggc acc tgg ctc ttc aca ccc gtg agc aag atg gcg act gtg 259
Leu Asn Gly Thr Trp Leu Phe Thr Pro Val Ser Lys Met Ala Thr Val
5 10 15 20

aag agt gag ctt att gag cgt ttc act tcc gag aag ccc gtt cat cac 307
Lys Ser Glu Leu Ile Glu Arg Phe Thr Ser Glu Lys Pro Val His His
25 30 35

agt aag gtc tcc atc ata gga act gga tcg gtg ggc atg gcc tgc gct 355
Ser Lys Val Ser Ile Ile Gly Thr Gly Ser Val Gly Met Ala Cys Ala
40 45 50

atc agc atc ttg tta aaa ggc ttg agt gat gaa ctt gcc ctt gtg gat 403
Ile Ser Ile Leu Leu Lys Gly Leu Ser Asp Glu Leu Ala Leu Val Asp
55 60 65

ctt gat gaa gac aaa ctg aag ggt gag acg atg gat ctt caa cat ggc 451
Leu Asp Glu Asp Lys Leu Lys Gly Glu Thr Met Asp Leu Gln His Gly
70 75 80

agc cct ttc acg aaa atg cca ata ttg ttt gta gca aag att act ttg 499
Ser Pro Phe Thr Lys Met Pro Ile Leu Phe Val Ala Lys Ile Thr Leu
85 90 95 100

tca cag caa act cca acc tagtgattat cacagcaggt gcacgccaag 547
Ser Gln Gln Thr Pro Thr
105

aaaagggaga aacgcgcctt aatttagtcc agcgaaatgt ggccatcttc aagtaatgat 607
ttccagtatt gtccagtaca gcccccactg caaactgatt attgtttcca atccagtgga 667
tatcttaact tatgtagctt ggaagttgag tgcatttccc aaaaaccgta ttattggaag
cggctgtaat ctggatactg ctcgttttcg tttcttgatt ggacaaaagc ttggtatcca
ttctgaaagc tgccatggat ggatcctcgg agagcatgga gactcaagtg ttcctgtgtg
gagtggagtg aacatagctg gtgtcccttt gaaggatctg aactctgata taggaactga
taaagatcct gagcaggaaa aatgtccaca aagaagtgac tgcaactgcc tatgagatta
ttaaaatgaa aggttatact tcttgggcca ttggcctatc tgtggccgat ttaacagaaa
gtattttgaa gaatcttagg agaatacatc cagtttccac cataactaag ggcctctatg
gaatagatga agaagtattc ctcagtattc cttgtatcct gggagagaac ggtattacca
accttataaa gataaagctg acccctgaag aagaggccca tctgaaaaaa agtgcaaaaa
cactctggga aattcagaat aagcttaagc tttaaagttg cctaaaacta ccattccgaa
attattgaag agatcataga tacaggatta tataacgaaa ttttgaataa acttgaattc
ctaaaagatg gaaacaggaa agtaggtaga gtgattttcc tatttattta gtcctccagc
tcttttattg agcatccacg tgctggacga tacttattta caattcctaa gtatttttgg
tacctctgat gtagcagcac ttgccatgtt atatatatgt agttggcatt tggttcccaa
aaagtaggat gtaggtattt attgtgttct agaaattccg actcttttca ttagatatat
gctatttctt tcattcttgc tggtttatac ctatgttcat ttatatgctg taaaaaagta
gtagcttctt ctacaatgta aaaataaatg tacatacaaa aaaatgcagt agtatataca
atcttttgtt ttgcttcctt tgatagttaa taaattccgt ttgttgaatc aataaaaaaa
aaaaaaaa

<210> 16
<211> 936
<212> DNA
<213> Homo sapiens



<220> 
<221> CDS 
<222> 59. . 



853 



<220> 

<221> sig_peptide 

<222> 59 . . 100 

<223> Von Heijne matrix 

score 5.2402423806254 

seq NFILFIFIPGVFS/LK 

<400> 16 

agaaaggagg ctctgggtag acgcactaga ttactggata 
atg aat ttt ata ttg ttt att ttt ata cct gga 
Met Asn Phe lie Leu Phe lie Phe lie Pro Gly 



-10 



-5 



agt age act ttg aag cct act att gaa gca ttg 
Ser Ser Thr Leu Lys Pro Thr He Glu Ala Leu 

5 10 
tta aat gaa gat gtt aat aag cag gaa gaa aag 
Leu Asn Glu Asp Val Asn Lys Gin Glu Glu Lys 

20 25 
ccc aat tat get cct get aat gag aaa aat ggc 
Pro Asn Tyr Ala Pro Ala Asn Glu Lys Asn Gly 
35 40 45 

ata aaa caa tat gtg ttc aca aca caa aat cca 
He Lys Gin Tyr Val Phe Thr Thr Gin Asn Pro 

55 60 
gaa ata tct gtg aga gec aca act gac ctg aat 
Glu He Ser Val Arg Ala Thr Thr Asp Leu Asn 

70 75 
gga tea ace cca aac gtg cct gca ttt tgg aca 
Gly Ser Thr Pro Asn Val Pro Ala Phe Trp Thr 

85 90 
ata aat gga aca gca gtg gtc atg gat gat aaa 
He Asn Gly Thr Ala Val Val Met Asp Asp Lys 

100 105 
cca att cca gag tct gat gtg aat get aca cag 



aatcacttca atttccca 
gtt ttt tec tta aaa 
Val Phe Ser Leu Lys 
1 

cct aat gtg eta cct 
Pro Asn Val Leu Pro 
15 

aat gaa gat cat act 
Asn Glu Asp His Thr 
30 

aat tat tat aaa gat 
Asn Tyr Tyr Lys Asp 
50 

aat ggc act gag tct 
Asn Gly Thr Glu Ser 
65 

ttt get eta aaa aac 
Phe Ala Leu Lys Asn 
80 

atg tta get aaa get 
Met Leu Ala Lys Ala 
95 

gat caa tta ttt cac 
Asp Gin Leu Phe His 
110 

gga gaa aat cag cca 



727 
787 
847 
907 
967 
1027 
1087 
1147 
1207 
1267 
1327 
1387 
1447 
1507 
1567 
1627 
1687 
1747 
1755 



58 
106 



154 



202 



250 



298 



346 



394 



442 



490 
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Pro He Pro Glu Ser Asp Val Asn Ala Thr Gin Gly Glu Asn Gin Pro 

115 120 125 130 

gat cta gag gat ctg aag atc aaa ata atg ctg gga atc tcg ttg atg 538
Asp Leu Glu Asp Leu Lys Ile Lys Ile Met Leu Gly Ile Ser Leu Met
135 140 145

acc ctc ctc ctc ttt gtg gtc ctc ttg gca ttc tgt agt gct aca ctg 586
Thr Leu Leu Leu Phe Val Val Leu Leu Ala Phe Cys Ser Ala Thr Leu
150 155 160

tac aaa ctg agg cat ctg agt tat aaa agt tgt gag agt cag tac tct 634
Tyr Lys Leu Arg His Leu Ser Tyr Lys Ser Cys Glu Ser Gln Tyr Ser
165 170 175

gtc aac cca gag ctg gcc acg atg tct tac ttt cat cca tca gaa ggt 682
Val Asn Pro Glu Leu Ala Thr Met Ser Tyr Phe His Pro Ser Glu Gly
180 185 190

gtt tca gat aca tcc ttt tcc aag agt gca gag agc agc aca ttt ttg 730
Val Ser Asp Thr Ser Phe Ser Lys Ser Ala Glu Ser Ser Thr Phe Leu
195 200 205 210

ggt acc act tct tca gat atg aga aga tca ggc aca aga aca tca gaa 778
Gly Thr Thr Ser Ser Asp Met Arg Arg Ser Gly Thr Arg Thr Ser Glu
215 220 225

tct aag ata atg acg gat atc att tcc ata ggc tca gat aat gag atg 826
Ser Lys Ile Met Thr Asp Ile Ile Ser Ile Gly Ser Asp Asn Glu Met
230 235 240

cat gaa aac gat gag tcg gtt acc cgg tgaagaaatc aaggaacccg 873
His Glu Asn Asp Glu Ser Val Thr Arg
245 250

gtgaagaaat cttattgatg aataaataac tttaattatt ttgtcatcaa aaaaaaaaaa 933
aaa 936

<210> 17
<211> 747
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 73..672

<220>
<221> sig_peptide
<222> 73..132
<223> Von Heijne matrix
score 5.21332530399231
seq SPVFLVFPPEITA/SE

<400> 17

acaagaaaag aacatggtct agactgaagt accaactaaa tcatctcctt tcaaattatc 60
accgacacca tc atg gat tca agc acc gca cac agt ccg gtg ttt ctg gta 111
Met Asp Ser Ser Thr Ala His Ser Pro Val Phe Leu Val
-20 -15 -10

ttt cct cca gaa atc act gct tca gaa tat gag tcc aca gaa ctt tca 159
Phe Pro Pro Glu Ile Thr Ala Ser Glu Tyr Glu Ser Thr Glu Leu Ser
-5 1 5

gcc acg acc ttt tca act caa agc ccc ttg caa aaa tta ttt gct aga 207
Ala Thr Thr Phe Ser Thr Gln Ser Pro Leu Gln Lys Leu Phe Ala Arg
10 15 20 25

aaa atg aaa atc tta ggg act atc cag atc ctg ttt gga att atg acc 255
Lys Met Lys Ile Leu Gly Thr Ile Gln Ile Leu Phe Gly Ile Met Thr
30 35 40

ttt tct ttt gga gtt atc ttc ctt ttc act ttg tta aaa cca tat cca 303
Phe Ser Phe Gly Val Ile Phe Leu Phe Thr Leu Leu Lys Pro Tyr Pro
45 50 55

agg ttt ccc ttt ata ttt ctt tca gga tat cca ttc tgg ggc tct gtt 351
Arg Phe Pro Phe Ile Phe Leu Ser Gly Tyr Pro Phe Trp Gly Ser Val
60 65 70

ttg ttc att aat tct gga gcc ttc cta att gca gtg aaa aga aaa acc 399
Leu Phe Ile Asn Ser Gly Ala Phe Leu Ile Ala Val Lys Arg Lys Thr
75 80 85

aca gaa act ctg ata ata ttg agc cga ata atg aat ttt ctt agt gcc 447
Thr Glu Thr Leu Ile Ile Leu Ser Arg Ile Met Asn Phe Leu Ser Ala
90 95 100 105

ctg gga gca ata gct gga atc att ctc ctc aca ttt ggt ttc atc cta 495
Leu Gly Ala Ile Ala Gly Ile Ile Leu Leu Thr Phe Gly Phe Ile Leu
110 115 120

gat caa aac tac att tgt ggt tat tct cac caa aat agt cag tgt aag 543
Asp Gln Asn Tyr Ile Cys Gly Tyr Ser His Gln Asn Ser Gln Cys Lys
125 130 135

gct gtt act gtc ctg ttc ttg gga att ttg att aca ttg atg act ttc 591
Ala Val Thr Val Leu Phe Leu Gly Ile Leu Ile Thr Leu Met Thr Phe
140 145 150

agc att att gaa tta ttc att tct ctg cct ttc tca att ttg ggg tgc 639
Ser Ile Ile Glu Leu Phe Ile Ser Leu Pro Phe Ser Ile Leu Gly Cys
155 160 165

cac tca gag gat tgt gat tgt gaa caa tgt tgt tgactagcac tgtgagaata 692
His Ser Glu Asp Cys Asp Cys Glu Gln Cys Cys
170 175 180

aagatgtgtt aaaatattaa aaaaaaaaaa aaaaaaaaag aaaaaaaaaa aaaaa 747

<210> 18
<211> 1884
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 94..1275

<220>
<221> sig_peptide
<222> 94..210
<223> Von Heijne matrix
score 4.55778392992629
seq LVLVKRLLAVSVS/CI

<400> 18

acagcgcgtg cagcctcgtg cagctcttct ggtctccggc gcccgcccct cagacgtaat 60
gttgaattaa agaaaatact ttatcagaag aag atg gcc act gcc cag ttg cag 114
Met Ala Thr Ala Gln Leu Gln
-35

agg act ccc atg agt gca ctg gta ttt ccc aat aag ata tca act gaa 162
Arg Thr Pro Met Ser Ala Leu Val Phe Pro Asn Lys Ile Ser Thr Glu
-30 -25 -20

cac cag tct ttg gtg tta gtg aag agg ctt cta gca gtt tca gta tcc 210
His Gln Ser Leu Val Leu Val Lys Arg Leu Leu Ala Val Ser Val Ser
-15 -10 -5

tgt atc acg tat ttg agg gga ata ttc cca gaa tgc gct tat gga aca 258
Cys Ile Thr Tyr Leu Arg Gly Ile Phe Pro Glu Cys Ala Tyr Gly Thr
1 5 10 15

aga tat cta gat gat ctt tgt gtc aaa ata ctg aga gaa gat aaa aat 306
Arg Tyr Leu Asp Asp Leu Cys Val Lys Ile Leu Arg Glu Asp Lys Asn
20 25 30

tgc cca gga tct aca cag tta gtg aaa tgg att cta gga tgt tat gat 354
Cys Pro Gly Ser Thr Gln Leu Val Lys Trp Ile Leu Gly Cys Tyr Asp
35 40 45

gct tta cag aaa aaa tat cta agg atg gtt gtt cta gct gta tac aca 402
Ala Leu Gln Lys Lys Tyr Leu Arg Met Val Val Leu Ala Val Tyr Thr
50 55 60

aac cca gaa gat cct cag aca att tca gaa tgt tac caa ttc aaa ttc 450
Asn Pro Glu Asp Pro Gln Thr Ile Ser Glu Cys Tyr Gln Phe Lys Phe
65 70 75 80

aaa tac acc aat aat gga cca ctc atg gac ttc ata agt aaa aac caa 498
Lys Tyr Thr Asn Asn Gly Pro Leu Met Asp Phe Ile Ser Lys Asn Gln
85 90 95

agc aac gaa tct agc atg ttg tct act gac acc aag aaa gca agc att 546
Ser Asn Glu Ser Ser Met Leu Ser Thr Asp Thr Lys Lys Ala Ser Ile
100 105 110

ctc ctc att cgc aag att tat atc cta atg caa aat ctg ggg cct tta 594
Leu Leu Ile Arg Lys Ile Tyr Ile Leu Met Gln Asn Leu Gly Pro Leu
115 120 125

cct aat gat gtt tgt ttg acc atg aaa ctt ttt tac tat gat gaa gtt 642
Pro Asn Asp Val Cys Leu Thr Met Lys Leu Phe Tyr Tyr Asp Glu Val
130 135 140

aca ccc cca gat tac cag cct ccc ggt ttt aag gat ggt gat tgt gaa 690
Thr Pro Pro Asp Tyr Gln Pro Pro Gly Phe Lys Asp Gly Asp Cys Glu
145 150 155 160

gga gtt ata ttt gaa ggg gaa cct atg tat tta aat gtg gga gaa gtc 738
Gly Val Ile Phe Glu Gly Glu Pro Met Tyr Leu Asn Val Gly Glu Val
165 170 175

tca aca cct ttt cac atc ttc aaa gta aaa gtg acc act gag aga gaa 786
Ser Thr Pro Phe His Ile Phe Lys Val Lys Val Thr Thr Glu Arg Glu
180 185 190

cga atg gaa aat att gac tca act ata cta tca cca aaa caa ata aaa 834
Arg Met Glu Asn Ile Asp Ser Thr Ile Leu Ser Pro Lys Gln Ile Lys
195 200 205

aca cca ttt caa aaa atc ctg agg gac aaa gat gta gaa gat gaa cag 882
Thr Pro Phe Gln Lys Ile Leu Arg Asp Lys Asp Val Glu Asp Glu Gln
210 215 220

gag cat tat aca agt gat gat ttg gac att gaa act aaa atg gaa gaa 930
Glu His Tyr Thr Ser Asp Asp Leu Asp Ile Glu Thr Lys Met Glu Glu
225 230 235 240

cag gaa aaa aac cct gca tct tct gaa ctt gaa gaa cca agt tta gtt 978
Gln Glu Lys Asn Pro Ala Ser Ser Glu Leu Glu Glu Pro Ser Leu Val
245 250 255

tgt gag gaa gat gaa att atg agg tct aaa gaa agt cca gat ctt tct 1026
Cys Glu Glu Asp Glu Ile Met Arg Ser Lys Glu Ser Pro Asp Leu Ser
260 265 270

att tct cat tct cag gtt gag cag tta gtc aat aaa aca tct gaa ctt 1074
Ile Ser His Ser Gln Val Glu Gln Leu Val Asn Lys Thr Ser Glu Leu
275 280 285

gat atg tct gaa agc aaa aca aga agt gga aaa gtc ttt cag aat aaa 1122
Asp Met Ser Glu Ser Lys Thr Arg Ser Gly Lys Val Phe Gln Asn Lys
290 295 300

atg gca aat gga aat caa cca gta aaa tct tcc aaa gaa aat cgg aag 1170
Met Ala Asn Gly Asn Gln Pro Val Lys Ser Ser Lys Glu Asn Arg Lys
305 310 315 320

aga agt caa cat gaa tct ggg aga ata gtc ctc cat cac ttt gat tct 1218
Arg Ser Gln His Glu Ser Gly Arg Ile Val Leu His His Phe Asp Ser
325 330 335

tct agt caa gag tca gtg cca aaa agg aga aag ttt agt gaa cca aag 1266
Ser Ser Gln Glu Ser Val Pro Lys Arg Arg Lys Phe Ser Glu Pro Lys
340 345 350

gaa cat ata taaaaattat ttttgttctg caggcttgca gagttcttct 1315
Glu His Ile
355

caccatttaa actgaaggac cctatattat atttccctaa ctctgaagat gtatatgtag 1375
tttaaagcag tttatacact aaaactaagt ttttggctga ctgtcatatt gtggtcctta 1435
atcttgagat aaatccaata gaacttttga ataaaagcaa aagtacaaat gtcataattg 1495
attcggtaat aagtaaaatt tcaaaattga ttttgttcat tacctactta atatttcctt 1555
taaatatata ctaactgtta aggccctcta atgccatttt tctaaacagt aatgtttact 1615
ttggtattaa aatttggtat tgattcactt tttacttatg ttaaaattat accatttaac 1675
tggctctttt gtcattgtgc tgttattaaa acaatgttct tcaatatttt gacataatgt 1735
attaacattt taatatataa tgtacaattt aagaattggt gctttacctt tactatgctt 1795
tttttacagg acaaaaagac tgatttttaa agtatggcat tttttgcagc ataaataaaa 1855
tattgttcag tacgaaaaaa aaaaaaaaa 1884

<210> 19
<211> 691
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 42..515

<220>
<221> sig_peptide
<222> 42..92
<223> Von Heijne matrix
score 10.7019149919754
seq VLMLLAVLIWTGA/EN

<400> 19

gagttgtcct gtgctggagg tctgctcaga cgaaggtctc c atg gcg tta gaa gtc 56
Met Ala Leu Glu Val
-15

ttg atg ctc ctc gct gtc ttg att tgg acc ggt gct gag aac ctc cat 104
Leu Met Leu Leu Ala Val Leu Ile Trp Thr Gly Ala Glu Asn Leu His
-10 -5 1

gtg aaa ata agt tgc tct ctg gac tgg ttg atg gtc tca gtt atc cca 152
Val Lys Ile Ser Cys Ser Leu Asp Trp Leu Met Val Ser Val Ile Pro
5 10 15 20

gtt gca gaa agc aga aat ctg tat ata ttt gcg gat gaa tta cat ctg 200
Val Ala Glu Ser Arg Asn Leu Tyr Ile Phe Ala Asp Glu Leu His Leu
25 30 35

gga atg ggc tgc cct gca aat cgg ata cat aca tat gta tat gag ttt 248
Gly Met Gly Cys Pro Ala Asn Arg Ile His Thr Tyr Val Tyr Glu Phe
40 45 50

ata tat ctt gtt cgt gat tgt ggc atc agg aca agg gta gtt tct gag 296
Ile Tyr Leu Val Arg Asp Cys Gly Ile Arg Thr Arg Val Val Ser Glu
55 60 65

gaa act ctc ctt ttt caa acc gag ctg tac ttt acc cca agg aat ata 344
Glu Thr Leu Leu Phe Gln Thr Glu Leu Tyr Phe Thr Pro Arg Asn Ile
70 75 80

gat cat gac cct cag gaa atc cat ttg gag tgt tcc acc tct agg aaa 392
Asp His Asp Pro Gln Glu Ile His Leu Glu Cys Ser Thr Ser Arg Lys
85 90 95 100

tca gtg tgg ctt aca cca gtt tct act gag aat gaa ata aaa ttg gat 440
Ser Val Trp Leu Thr Pro Val Ser Thr Glu Asn Glu Ile Lys Leu Asp
105 110 115

cct agt cct ttt att gct gac ttt cag aca aca gca gaa gag tta gga 488
Pro Ser Pro Phe Ile Ala Asp Phe Gln Thr Thr Ala Glu Glu Leu Gly
120 125 130

tta tta tct tct agt cca aac ttg ctc tgagctaaag gagaaatgga 535
Leu Leu Ser Ser Ser Pro Asn Leu Leu
135 140

aacttgaagc tggtgttatg tattttgcag gaaaacagtt tcattttttc atagcaaaaa 595
tatagttggt gtatatctct ccttaagtct ctggtttcta aaaaccctac ttcagtaaag 655
gtcctgatta gttgattagc gaaaaaaaaa aaaaaa 691

<210> 20
<211> 1138
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 271..969

<220>
<221> sig_peptide
<222> 271..366
<223> Von Heijne matrix
score 5.6680378526706
seq WMGLACFRSLAAS/SP

<220>
<221> misc_feature
<222> 989
<223> n=a, g, c or t

<400> 20

aaaaaccttt caagtgcccc ctcctttcct taaagtcttt tataggggtc cccttcttgg 60
ccatctccat cctgtgagtc aggactgaaa gggcacagac aggtcactgc cagcattgtt 120
ggggcaagcc tgcaagcacg catcactggg gatctgacat gacaatggcc gcctgccccc 180
tctgagggct acaggactta ccccagtggg aagcagctaa gcaggtctga ccagccgacc 240
tggacctggc caagggtcct gtcatccctc atg gcc acc ccg cca ttc cgg ctg 294
Met Ala Thr Pro Pro Phe Arg Leu
-30 -25

ata agg aag atg ttt tcc ttc aag gtg agc aga tgg atg ggg ctt gcc 342
Ile Arg Lys Met Phe Ser Phe Lys Val Ser Arg Trp Met Gly Leu Ala
-20 -15 -10

tgc ttc cgg tcc ctg gcg gca tcc tct ccc agt att cgc cag aag aaa 390
Cys Phe Arg Ser Leu Ala Ala Ser Ser Pro Ser Ile Arg Gln Lys Lys
-5 1 5

cta atg cac aag ctg cag gag gaa aag gct ttt cgc gaa gag atg aaa 438
Leu Met His Lys Leu Gln Glu Glu Lys Ala Phe Arg Glu Glu Met Lys
10 15 20

att ttt cgt gaa aaa ata gag gac ttc agg gaa gag atg tgg act ttc 486
Ile Phe Arg Glu Lys Ile Glu Asp Phe Arg Glu Glu Met Trp Thr Phe
25 30 35 40

cga ggc aag atc cat gct ttc cgg ggc cag atc ctg ggt ttt tgg gaa 534
Arg Gly Lys Ile His Ala Phe Arg Gly Gln Ile Leu Gly Phe Trp Glu
45 50 55

gag gag aga cct ttc tgg gaa gag gag aaa acc ttc tgg aaa gag gaa 582
Glu Glu Arg Pro Phe Trp Glu Glu Glu Lys Thr Phe Trp Lys Glu Glu
60 65 70

aaa tcc ttc tgg gaa atg gaa aag tct ttc agg gag gaa gag aaa act 630
Lys Ser Phe Trp Glu Met Glu Lys Ser Phe Arg Glu Glu Glu Lys Thr
75 80 85

ttc tgg aaa aag tac cgc act ttc tgg aag gag gat aag gcc ttc tgg 678
Phe Trp Lys Lys Tyr Arg Thr Phe Trp Lys Glu Asp Lys Ala Phe Trp
90 95 100

aaa gag gac aat gcc tta tgg gaa aga gac cgg aac ctt ctt cag gag 726
Lys Glu Asp Asn Ala Leu Trp Glu Arg Asp Arg Asn Leu Leu Gln Glu
105 110 115 120

gac aag gcc ctg tgg gag gaa gaa aag gcc ctg tgg gta gag gaa aga 774
Asp Lys Ala Leu Trp Glu Glu Glu Lys Ala Leu Trp Val Glu Glu Arg
125 130 135

gcc ctc ctt gag ggg gag aaa gcc ctg tgg gaa gat aaa acg tcc ctc 822
Ala Leu Leu Glu Gly Glu Lys Ala Leu Trp Glu Asp Lys Thr Ser Leu
140 145 150

tgg gag gaa gag aat gcc ctc tgg gag gaa gag agg gcc ttc tgg atg 870
Trp Glu Glu Glu Asn Ala Leu Trp Glu Glu Glu Arg Ala Phe Trp Met
155 160 165

gag aac aat ggc cac att gcc gga gag cag atg ctc gaa gat ggg ccc 918


Glu Asn Asn Gly His Ile Ala Gly Glu Gln Met Leu Glu Asp Gly Pro
170 175 180

cac aac gcc aac aga ggg cag cgc ttg ctg gcc ttc tcc cga ggc agg 966
His Asn Ala Asn Arg Gly Gln Arg Leu Leu Ala Phe Ser Arg Gly Arg
185 190 195 200

gcg tagccagcat gcaggtgcan gggccctgtg gtccagactc ccctgggttg 1019
Ala

ggattcaagt ccagggtgag cccatgtgct ggagaaaata cacactcatt ggtctccttg 1079
ctttgaaaga tccaataaag tcctgaggca aggtttggaa aaccaaaaaa aaaaaaaaa 1138

<210> 21
<211> 468
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 76..276

<220>
<221> sig_peptide
<222> 76..135
<223> Von Heijne matrix
score 5.21332530399231
seq SPVFLVFPPEITA/SE

<400> 21

agcacaagaa aagaacatgg tctagactga agtaccaact aaatcatctc ctttcaaatt 60
atcaccgaca ccatc atg gat tca agc acc gca cac agt ccg gtg ttt ctg 111
Met Asp Ser Ser Thr Ala His Ser Pro Val Phe Leu
-20 -15 -10

gta ttt cct cca gaa atc act gct tca gaa tat gag tcc aca gaa ctt 159
Val Phe Pro Pro Glu Ile Thr Ala Ser Glu Tyr Glu Ser Thr Glu Leu
-5 1 5

tca gcc acg acc ttt tca act caa agc ccc ttg caa aaa tta ttt gct 207
Ser Ala Thr Thr Phe Ser Thr Gln Ser Pro Leu Gln Lys Leu Phe Ala
10 15 20

aga aaa atg aaa atc tta ggg gat atc cat tct ggg gct ctg ttt tgt 255
Arg Lys Met Lys Ile Leu Gly Asp Ile His Ser Gly Ala Leu Phe Cys
25 30 35 40

tca tta att ctg gag cct tcc taattgcagt gaaaagaaaa accacagaaa 306
Ser Leu Ile Leu Glu Pro Ser
45

ctctgggaat tttgattaca ttgatgactt tcagcattat tgaattattc atttctctgc 366
ctttctcaat tttggggtgc cactcagagg attgtgattg tgaacaatgt tgttgactag 426
cactgtgaga ataaagatgt gttaaaataa aaaaaaaaaa aa 468

<210> 22
<211> 720
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 6..287

<220>
<221> sig_peptide
<222> 6..80
<223> Von Heijne matrix
score 4.17710408129886
seq ISLSHLFLDLSRS/LW

<400> 22

atttg atg tgc ttc tta gtc tcg ttt aac ttg ccg att cat ata tcc ctg 50
Met Cys Phe Leu Val Ser Phe Asn Leu Pro Ile His Ile Ser Leu
-25 -20 -15

tct cat ttg ttc tta gat ttg tca cga agc ctc tgg ttt ttg gct tgt 98
Ser His Leu Phe Leu Asp Leu Ser Arg Ser Leu Trp Phe Leu Ala Cys
-10 -5 1 5

cct ggt ttg aac ttg gtg tat ctg gct ctt gac tca ttt tct gac ctc 146
Pro Gly Leu Asn Leu Val Tyr Leu Ala Leu Asp Ser Phe Ser Asp Leu
10 15 20

aga cca tcc tta aat ctg ctt ttc tac ttt gta cca ggc ttt ggc gtc 194
Arg Pro Ser Leu Asn Leu Leu Phe Tyr Phe Val Pro Gly Phe Gly Val
25 30 35

tcc aag tac ctg acc tca gct caa cct gtc ttg ggt ttt ctt ctc ctc 242
Ser Lys Tyr Leu Thr Ser Ala Gln Pro Val Leu Gly Phe Leu Leu Leu
40 45 50

cct gac att gac aac cca gcc ctc cta ggc aca gag aga tgg agc 287
Pro Asp Ile Asp Asn Pro Ala Leu Leu Gly Thr Glu Arg Trp Ser
55 60 65

tgagtgtggt tttcctgaaa taaagcttgc attatgagag ggaataaaca gaagaaaaaa 347
atagtaagta aaatcttgct tgcctctcag taaaataaag ctctattttt cgtttttttt 407
ttttccaact tcctgtacaa aaaagggaaa actttagctt ttgggggaaa tttggagcta 467
gcctgttggt actgttgagc ttagtgtatc tataactata tattattcca caatatctta 527
aatactttat aaagatattt tcataaatta cagcaatcct ggctttagat gattgatggc 587
catttttaaa caattaaagc taatttctag ctttttatga gtttggtatt aagcacagta 647
gtttcttaga aagtctccag ggaatgcatt ttgcaaaata aaaatcagct aatgacccaa 707
aaaaaaaaaa aaa 720

<210> 23
<211> 727
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 171..692

<220>
<221> sig_peptide
<222> 171..227
<223> Von Heijne matrix
score 4.17573075349936
seq LLLGQRCSLKVSG/QE

<400> 23

attgtgacat caccgtgcac tagccaatgg ctgcctgcct aagctgggtc cctggtctcc 60
tgggactact agccctttgt tgatagggag aagccaacat ctcccgcagg accccctaat 120
cttcagggca gctcccagag catggatccc tcctgattcc actcagcccg atg ttc 176
Met Phe

ctc aca gtc aag ctg ctc ctg ggc cag aga tgc agt ctg aag gtg tca 224
Leu Thr Val Lys Leu Leu Leu Gly Gln Arg Cys Ser Leu Lys Val Ser
-15 -10 -5

ggg caa gag agt gta gcc acg ctg aag aga ctg gtg tcc agg cgg ctg 272
Gly Gln Glu Ser Val Ala Thr Leu Lys Arg Leu Val Ser Arg Arg Leu
1 5 10 15

aag gtg cct gag gag cag cag cac ctg ctt ttc cgt ggc cag ctc ctg 320
Lys Val Pro Glu Glu Gln Gln His Leu Leu Phe Arg Gly Gln Leu Leu
20 25 30

gag gat gac aag cac ctc tct gac tac tgc att ggg ccc aat gcc tct 368
Glu Asp Asp Lys His Leu Ser Asp Tyr Cys Ile Gly Pro Asn Ala Ser
35 40 45

atc aat gtc atc atg cag ccc ttg gag aag atg gcg cta aag gag gcc 416
Ile Asn Val Ile Met Gln Pro Leu Glu Lys Met Ala Leu Lys Glu Ala
50 55 60

cac cag ccg cag acc cag ccc ctg tgg cac cag ctg gga ctg gtc cta 464
His Gln Pro Gln Thr Gln Pro Leu Trp His Gln Leu Gly Leu Val Leu
65 70 75

gct aaa cac ttt gaa cca cag gat gcc aag gcc gtg ctg cag ctg cta 512
Ala Lys His Phe Glu Pro Gln Asp Ala Lys Ala Val Leu Gln Leu Leu
80 85 90 95

agg cag gag cac gag gag cgc ctg cag aag ata agc ctg gag cac ctg 560
Arg Gln Glu His Glu Glu Arg Leu Gln Lys Ile Ser Leu Glu His Leu
100 105 110

gag cag ctg gcc cag tac ctc ctg gca gag gag cct cac gtg gag cca 608
Glu Gln Leu Ala Gln Tyr Leu Leu Ala Glu Glu Pro His Val Glu Pro
115 120 125

gct gga gag agg gag ctt gag gcg aag gca cgg cct cag agc tcc tgt 656
Ala Gly Glu Arg Glu Leu Glu Ala Lys Ala Arg Pro Gln Ser Ser Cys
130 135 140

gac atg gag gag aag gag gag gca gca gct gat cag taaacgggcc 702
Asp Met Glu Glu Lys Glu Glu Ala Ala Ala Asp Gln
145 150 155

atcctacccg aaaaaaaaaa aaaaa 727



<210> 24
<211> 470
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 137..454

<220>
<221> sig_peptide
<222> 137..187
<223> Von Heijne matrix
score 10.7019149919754
seq VLMLLAVLIWTGA/EN

<400> 24

atcctgtgaa ctacccaaaa ggaggaaaac gaacgcagct gagcatggga tgccatataa 60
aaatcactta aaccagtcgc cactccttgt ttcctgagtt gtcctgtgct ggaggtctgc 120
tcagacgaag gtctcc atg gcg tta gaa gtc ttg atg ctc ctc gct gtc ttg 172
Met Ala Leu Glu Val Leu Met Leu Leu Ala Val Leu
-15 -10
<210> 25
<211> 987
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 238..609

<220>
<221> sig_peptide
<222> 238..291
<223> Von Heijne matrix
score 10.0374888212272
seq LLLLVMALPPGTT/GV

<400> 25

attccattca cagactcttg ttgggcagca gccacccgct cacctccatc cccaggactt 60
agagggacgc agggcgttgg gaacagagga cactccaggc gctgaccctg ggaggccagg 120
accagggcca aagtcccgtg ggcaagagga gtcctcagag gtccttcatt cagcggttcc 180
gggaggtctg ggaagcccac ggcctggctg gggcagggtc aacgccgcca ggccgcc 237
atg gtc ctg tgc tgg ctg ctg ctt ctg gtg atg gct ctg ccc cca ggc 285
Met Val Leu Cys Trp Leu Leu Leu Leu Val Met Ala Leu Pro Pro Gly
-15 -10 -5

acg acg ggc gtc aag gac tgc gtc ttc tgt gag ctc acc gac tcc atg 333
Thr Thr Gly Val Lys Asp Cys Val Phe Cys Glu Leu Thr Asp Ser Met
1 5 10

cag tgt cct ggt acc tac atg cac tgt ggc gat gac gag gac tgc ttc 381
Gln Cys Pro Gly Thr Tyr Met His Cys Gly Asp Asp Glu Asp Cys Phe
15 20 25 30

aca ggc cac ggg gtc gcc ccg ggc act ggt ccg gtc atc aac aaa ggc 429
Thr Gly His Gly Val Ala Pro Gly Thr Gly Pro Val Ile Asn Lys Gly
35 40 45

tgc ctg cga gcc acc agc tgc ggc ctt gag gaa ccc gtc agc tac agg 477
Cys Leu Arg Ala Thr Ser Cys Gly Leu Glu Glu Pro Val Ser Tyr Arg
50 55 60

ggc gtc acc tac agc ctc acc acc aac tgc tgc acc ggc cgc ctg tgt 525
Gly Val Thr Tyr Ser Leu Thr Thr Asn Cys Cys Thr Gly Arg Leu Cys
65 70 75

aac aga gcc ccg agc agc cag aca gtg ggg gcc acc acc agc ctg gca 573
Asn Arg Ala Pro Ser Ser Gln Thr Val Gly Ala Thr Thr Ser Leu Ala
80 85 90

ctg ggg ctg ggt atg ctg ctt cct cca cgt ttg ctg tgaccaacag 619
Leu Gly Leu Gly Met Leu Leu Pro Pro Arg Leu Leu
95 100 105

ggaggacagg gcctgggact gttctcccag atccgccact ccccatgtcc ccatgtcctt 679
cccccactaa atggccagag aggccctgga caacctcttg cggccctggc ttcatccctt 739
ctaaggctgt ccaccaggag cccggtgcta ggggaagcat ccccaggcct gactgagcgg 799
caggggagca cggcccgtgg gtttgattgt attactctgt tccactggtt ctaagacgca 859
gagcttctca catctcaatc aggatgcttc tctccattgg tagcacttta gagtccatga 919
aatatggtaa aaaatatata tatatcataa taaatgacag ctgatgttca tggaaaaaaa 979
aaaaaaaa 987



<210> 26
<211> 908
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 80..862

<220>
<221> sig_peptide
<222> 80..127
<223> Von Heijne matrix
score 3.66725851505537
seq FSLLSISGPPISS/SA

<400> 26

gaatgtttat cctctggaca aaccagccag cctctccaga gcaggcgtgt gatctctgta 60
cccccgcagt ggtcagaat atg gag aac ttc tca ctc ctc agc atc tct gga 112
Met Glu Asn Phe Ser Leu Leu Ser Ile Ser Gly
-15 -10

cct cca atc tct tcc tcc gcc ctg agt gct ttt ccc gac att atg ttc 160
Pro Pro Ile Ser Ser Ser Ala Leu Ser Ala Phe Pro Asp Ile Met Phe
-5 1 5 10

tct cgt gcc acc agc ctg cca gac att gca aag aca gca gta ccc act 208
Ser Arg Ala Thr Ser Leu Pro Asp Ile Ala Lys Thr Ala Val Pro Thr
15 20 25

gag gca tcc agc cca gct cag gcc ctg cca ccc cag tac caa agc atc 256
Glu Ala Ser Ser Pro Ala Gln Ala Leu Pro Pro Gln Tyr Gln Ser Ile
30 35 40

att gtc agg caa ggg ata cag aac aca gtg ctc tca cca gac tgc agc 304
Ile Val Arg Gln Gly Ile Gln Asn Thr Val Leu Ser Pro Asp Cys Ser
45 50 55

ttg ggg gac acc cag cac gga gag aag ctg agg cgg aac tgc act atc 352
Leu Gly Asp Thr Gln His Gly Glu Lys Leu Arg Arg Asn Cys Thr Ile
60 65 70 75

tac cgg ccc tgg ttc tcc ccc tac agc tac ttc gtg tgt gca gac aaa 400
Tyr Arg Pro Trp Phe Ser Pro Tyr Ser Tyr Phe Val Cys Ala Asp Lys
80 85 90

gag agc cag ctg gag gcc tat gac ttc cca gag gtg cag cag gat gag 448
Glu Ser Gln Leu Glu Ala Tyr Asp Phe Pro Glu Val Gln Gln Asp Glu
95 100 105

ggc aag tgg gac aac tgc ctt tct gag gac atg gct gag aac atc tgt 496
Gly Lys Trp Asp Asn Cys Leu Ser Glu Asp Met Ala Glu Asn Ile Cys
110 115 120

tcg tcc tct tcc tcc cca gag aac act tgc cct cga gaa gcc acc aag 544
Ser Ser Ser Ser Ser Pro Glu Asn Thr Cys Pro Arg Glu Ala Thr Lys
125 130 135

aaa tcc agg cat ggc ctg gac tcc atc aca tcc cag gac atc cta atg 592
Lys Ser Arg His Gly Leu Asp Ser Ile Thr Ser Gln Asp Ile Leu Met
140 145 150 155

gct tcc aga tgg cac cca gca cag cag aat ggc tac aag tgc gtg gcc 640
Ala Ser Arg Trp His Pro Ala Gln Gln Asn Gly Tyr Lys Cys Val Ala
160 165 170

tgc tgc cgc atg tac ccc acc ctg gac ttc ctc aag agc cac atc aag 688
Cys Cys Arg Met Tyr Pro Thr Leu Asp Phe Leu Lys Ser His Ile Lys
175 180 185

agg ggc ttc agg gag ggc ttc agc tgc aag gtg tac tac cgc aag ctc 736
Arg Gly Phe Arg Glu Gly Phe Ser Cys Lys Val Tyr Tyr Arg Lys Leu
190 195 200

aaa gcc ctc tgg agc aag gag cag aag gcc cgg ctg gga gac agg ctc 784
Lys Ala Leu Trp Ser Lys Glu Gln Lys Ala Arg Leu Gly Asp Arg Leu
205 210 215

tcc tcc ggc agc tgc cag gcc ttc aat agt cct gct gaa cac ctt agg 832
Ser Ser Gly Ser Cys Gln Ala Phe Asn Ser Pro Ala Glu His Leu Arg
220 225 230 235

caa att ggc ggt gaa gcc tac tta tgt ctc tagagagatg ccaataaagt 882
Gln Ile Gly Gly Glu Ala Tyr Leu Cys Leu
240 245

tagtcacagc caaaaaaaaa aaaaaa 908

<210> 27
<211> 762
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 83..310

<220>
<221> sig_peptide
<222> 83..157
<223> Von Heijne matrix
score 4.72955689475746
seq LCALLSNFCPSTT/VK

<400> 27

ttttttctac tacaaacgcc atggggatgc gggtctggga acagcggaaa accctaccct 60
gccctgaaaa gtccctggct ca atg tgc atg tcc ctt tct atg aaa gtt cct 112
Met Cys Met Ser Leu Ser Met Lys Val Pro
-25 -20

tgc tgc cta tgc gcc ttg ctc tct aac ttc tgt ccc tcc aca act gtg 160
Cys Cys Leu Cys Ala Leu Leu Ser Asn Phe Cys Pro Ser Thr Thr Val
-15 -10 -5 1

aaa gga gac gtc gtg act tcc ttc ttt cgt gct gac tat gac tta gcc 208
Lys Gly Asp Val Val Thr Ser Phe Phe Arg Ala Asp Tyr Asp Leu Ala
5 10 15

agt agg tct gca gat cag tcc tcc cag aaa gtg aag ttg cgc atg ttc 256
Ser Arg Ser Ala Asp Gln Ser Ser Gln Lys Val Lys Leu Arg Met Phe
20 25 30

act ggg cgt ctt ccc atc ggc ccc ttc gcc agt gtg ggg aac gcg gcg 304
Thr Gly Arg Leu Pro Ile Gly Pro Phe Ala Ser Val Gly Asn Ala Ala
35 40 45

gag ctg tgagccggcg actcgggtcc ctgaggtctg gattctttct ccgctactga 360
Glu Leu
50

gacacggcgg acacacacaa acacagaacc acacagccag tcccaggagc ccagtaatgg 420
agagccccaa aaagaagaac cagcagctga aagtcgggat cctacacctg ggcagcagac 480
agaagaagat caggatacag ctgagatccc agtgcgcgac atggaaggtg atctgcaaga 540
gctgcatcag tcaaacaccg gggataaatc tggatttggg ttccggcgtc aaggtgaaga 600
taatacctaa agaggaacac tgtaaaatgc cagaagcagg tgaagagcaa ccacaagttt 660
aaatgaagac aagctgaaac aacgcaagct ggttttatat tagatatttg acttaaacta 720
tctcaataaa gttttgcagc tttcaccaaa aaaaaaaaaa aa 762

<210> 28
<211> 1102
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 310..906

<220>
<221> sig_peptide
<222> 310..357
<223> Von Heijne matrix
score 11.0931109030915
seq FPLLLLSLGLVLA/EA

<400> 28

atacagtgac ctagagcagg catgggtggg tcacaggctt tggagagcac tctctgtcct 60
gatcttttca gttgagagac ttcagctgtt cattgctcat ttggacttag ttcaaggtca 120
tgtcaaagaa gaaggtgcac ttacgctagt tgttagctct gtcttttgta accatcaagt 180
tccatgcgat tgatcagatt taggaggggg cgttggggga taatcaattt tgggtgtcac 240
caggtaaaca gagccctcag catctgaata gaaactgaac aggaacagaa gagattcact 300
acatctgag atg gag acc ttt cct ctg ctg ctg ctc agc ctg ggc ctg gtt 351
Met Glu Thr Phe Pro Leu Leu Leu Leu Ser Leu Gly Leu Val
-15 -10 -5

ctt gca gaa gca tca gaa agc aca atg aag ata att aaa gaa gaa ttt 399
Leu Ala Glu Ala Ser Glu Ser Thr Met Lys Ile Ile Lys Glu Glu Phe
1 5 10

aca gac gaa gag atg caa tat gac atg gca aaa agt ggc caa gaa aaa 447
Thr Asp Glu Glu Met Gln Tyr Asp Met Ala Lys Ser Gly Gln Glu Lys
15 20 25 30

cag acc att gag ata tta atg aac ccg atc ctg tta gtt aaa aat acc 495
Gln Thr Ile Glu Ile Leu Met Asn Pro Ile Leu Leu Val Lys Asn Thr
35 40 45

agc ctc agc atg tcc aag gat gat atg tct tcc aca tta ctg aca ttc 543
Ser Leu Ser Met Ser Lys Asp Asp Met Ser Ser Thr Leu Leu Thr Phe
50 55 60

aga agt tta cat tat aat gac ccc aag gga aac agt tcg ggt aat gac 591
Arg Ser Leu His Tyr Asn Asp Pro Lys Gly Asn Ser Ser Gly Asn Asp
65 70 75

aaa gag tgt tgc aat gac atg aca gtc tgg aga aaa gtt tca gaa gca 639
Lys Glu Cys Cys Asn Asp Met Thr Val Trp Arg Lys Val Ser Glu Ala
80 85 90

aac gga tcg tgc aag tgg agc aat aac ttc atc cgc agc tcc aca gaa 687
Asn Gly Ser Cys Lys Trp Ser Asn Asn Phe Ile Arg Ser Ser Thr Glu
95 100 105 110

gtg atg cgc agg gtc cac agg gcc ccc agc tgc aag ttt gta cag aat 735
Val Met Arg Arg Val His Arg Ala Pro Ser Cys Lys Phe Val Gln Asn
115 120 125

cct ggc ata agc tgc tgt gag agc cta gaa ctg gaa aat aca gtg tgc 783
Pro Gly Ile Ser Cys Cys Glu Ser Leu Glu Leu Glu Asn Thr Val Cys
130 135 140

cag ttc act aca ggc aaa caa ttc ccc agg tgc caa tac cat agt gtt 831
Gln Phe Thr Thr Gly Lys Gln Phe Pro Arg Cys Gln Tyr His Ser Val
145 150 155

acc tca tta gag aag ata ttg aca gtg ctg aca ggt cat tct ctg atg 879
Thr Ser Leu Glu Lys Ile Leu Thr Val Leu Thr Gly His Ser Leu Met
160 165 170

agc tgg tta gtt tgt ggc tct aag ttg taaatcccac agagctttag 926
Ser Trp Leu Val Cys Gly Ser Lys Leu
175 180

gactagggtc ttactaaaga aggacctctt cttgttcatt cttgtttaaa cctttcctta 986
atatctactc tttagcacta tagtgaactc ctgattattt attctaactg gaggagtgaa 1046
aaatccaaaa ttgtggataa ttcaattaaa agttatgact gaaaaaaaaa aaaaaa 1102

<210> 29
<211> 436
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 24..287

<220>
<221> sig_peptide
<222> 24..131
<223> Von Heijne matrix
score 3.79790641648006
seq ILMRDFSPSGIFG/AF

<400> 29

acagcggaca ccaggactcc aaa atg gcg tca gtt gta cca gtg aag gac aag 53
Met Ala Ser Val Val Pro Val Lys Asp Lys
-35 -30

aaa ctt ctg gag gtc aaa ctg ggg gag ctg cca agc tgg atc ttg atg 101
Lys Leu Leu Glu Val Lys Leu Gly Glu Leu Pro Ser Trp Ile Leu Met
-25 -20 -15

cgg gac ttc agt cct agt ggc att ttc gga gcg ttt caa aga ggt tac 149
Arg Asp Phe Ser Pro Ser Gly Ile Phe Gly Ala Phe Gln Arg Gly Tyr
-10 -5 1 5

tac cgg tac tac aac aag tac atc aat gtg aag aag ggg agc atc tcg 197
Tyr Arg Tyr Tyr Asn Lys Tyr Ile Asn Val Lys Lys Gly Ser Ile Ser
10 15 20

ggg att acc atg gtg ctg gca tgc tac gtg ctc ttt agc tac tcc ttt 245
Gly Ile Thr Met Val Leu Ala Cys Tyr Val Leu Phe Ser Tyr Ser Phe
25 30 35

tcc tac aag cat ctc aag cac gag cgg ctc cgc aaa tac cac 287
Ser Tyr Lys His Leu Lys His Glu Arg Leu Arg Lys Tyr His
40 45 50

tgaagaggac acactctgca cccccccacc ccacgacctt ggcccgagcc cctccgtgag 347
gaacacaatc tcaatcgttg ctgaatcctt tcatatccta ataggaatta acctccaaat 407
aaaacatgac tggtaaaaaa aaaaaaaaa 436

<210> 30
<211> 1938
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 132..1574

<220>
<221> sig_peptide
<222> 132..206
<223> Von Heijne matrix
score 11.1130239236827
seq LALLLTSTPEALG/AN

<400> 30
ctccccttcc cgctcccagg aacccatcca gcctcaggaa ctgcccccag ccatcgagcc 60
ttggctactt aagggacctg ggcccaatcc acagctggga cagtcctggc ccactgcact 120
gggaatctag g atg ggg gcc ttg gcc aga gcc ctg ccg tcc ata ctg ctg 170
Met Gly Ala Leu Ala Arg Ala Leu Pro Ser Ile Leu Leu
-25 -20 -15

gca ttg ctg ctt acg tcc acc cca gag gct ctg ggt gcc aac ccc ggc 218
Ala Leu Leu Leu Thr Ser Thr Pro Glu Ala Leu Gly Ala Asn Pro Gly
-10 -5 1

ttg gtc gcc agg atc acc gac aag gga ctg cag tat gcg gcc cag gag 266
Leu Val Ala Arg Ile Thr Asp Lys Gly Leu Gln Tyr Ala Ala Gln Glu
5 10 15 20

ggg cta ttg gct ctg cag agt gag ctg ctc agg atc acg ctg cct gac 314
Gly Leu Leu Ala Leu Gln Ser Glu Leu Leu Arg Ile Thr Leu Pro Asp
25 30 35

ttc acc ggg gac ttg agg atc ccc cac gtc ggc cgt ggg cgc tat gag 362
Phe Thr Gly Asp Leu Arg Ile Pro His Val Gly Arg Gly Arg Tyr Glu
40 45 50

ttc cac agc ctg aac atc cac agc tgt gag ctg ctt cac tct gcg ctg 410
Phe His Ser Leu Asn Ile His Ser Cys Glu Leu Leu His Ser Ala Leu
55 60 65

agg cct gtc cct ggc cag ggc ctg agt ctc agc atc tcc gac tcc tcc 458
Arg Pro Val Pro Gly Gln Gly Leu Ser Leu Ser Ile Ser Asp Ser Ser
70 75 80

atc cgg gtc cag ggc agg tgg aag gtg cgc aag tca ttc ttc aaa cta 506
Ile Arg Val Gln Gly Arg Trp Lys Val Arg Lys Ser Phe Phe Lys Leu
85 90 95 100

ggg ctg cag atc cat aag gac ttc ctg ttc ttg ggt gcc aat gtc caa 1562
Gly Leu Gln Ile His Lys Asp Phe Leu Phe Leu Gly Ala Asn Val Gln
440 445 450

tac atg aga gtt tgaggacaag aaagatgaag cttggaggtc acagctggat 1614
Tyr Met Arg Val
455

ctgcttgttg catttccagc tgtgcagcac gtctcagaga ttcttgaaga atgaagacat 1674
ttctgctctc agctccgggg gtgaggtgtg cctggcctct gcctccaccc tcctcctctt 1734
caccaggtgc atgcatgccc tctctgagtc tggactttgc ttcccctcca ggagggacca 1794
ccctccctga ctggcctggg atatctttac aagcaggcac tgtatttttt tattcgccat 1854
ctgatcccca tgcctagcag agtgctggca cttagtaggt cctcaataaa tatttattaa 1914
atgatgacaa aaaaaaaaaa 1938



<210> 31
<211> 1116
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 117..545

<220>
<221> sig_peptide
<222> 117..245
<223> Von Heijne matrix
score 5.65876793443964
seq VVSFALIATLVYA/LF

<400> 31
ataaggggac gtctagtggg ttgcccggga ggggtggcgg gagcggtcct ggaaataatc 60
tgtcctctgt cgccgggaac tggcgaggta gttccttcgc ggtggagaga cctgga atg 119
Met

gcc aaa tat caa ggt gaa gtt caa agt ttg aaa ctg gat gat gat tca 167
Ala Lys Tyr Gln Gly Glu Val Gln Ser Leu Lys Leu Asp Asp Asp Ser
-40 -35 -30

gtt ata gaa gga gta agc gac caa gta ctt gtg gca gtt gtg gtc agt 215
Val Ile Glu Gly Val Ser Asp Gln Val Leu Val Ala Val Val Val Ser
-25 -20 -15

ttc gct ttg att gct acc ctg gta tat gca ctt ttc aga aat gta cat 263
Phe Ala Leu Ile Ala Thr Leu Val Tyr Ala Leu Phe Arg Asn Val His
-10 -5 1 5

caa aac att cac cca gaa aac cag gag cta gta agg gta ctt cga gaa 311
Gln Asn Ile His Pro Glu Asn Gln Glu Leu Val Arg Val Leu Arg Glu
10 15 20

cag ctt caa aca gaa cag gat gca cct gct gcc act cga cag cag ttc 359
Gln Leu Gln Thr Glu Gln Asp Ala Pro Ala Ala Thr Arg Gln Gln Phe
25 30 35

tac act gac atg tac tgt ccc atc tgc ctg cac caa gcc tcc ttc ccg 407
Tyr Thr Asp Met Tyr Cys Pro Ile Cys Leu His Gln Ala Ser Phe Pro
40 45 50

gtg gag acc aac tgt gga cat ctt ttt tgt ggt gcc tgc att att gct 455
Val Glu Thr Asn Cys Gly His Leu Phe Cys Gly Ala Cys Ile Ile Ala
55 60 65 70

tac tgg cga tat ggt tca tgg ctt ggg gca atc agt tgt cca atc tgt 503
Tyr Trp Arg Tyr Gly Ser Trp Leu Gly Ala Ile Ser Cys Pro Ile Cys
75 80 85

aga caa acg aga cat ggc cac att gca ttg tcc aga aca gct 545
Arg Gln Thr Arg His Gly His Ile Ala Leu Ser Arg Thr Ala
90 95 100

tagaccatga cagttagcat cgaagccacc tgaggaggga ggcagtaacc ttactcctaa 605
cagtatttgg tgaagatgat cagtctcagg atgttctgag attgcatcag gatattaatg 665
attataaccg gagattctca gggcaaccca gatctgtaag taatgctaaa gcatgttcaa 725
agttagagga agacacattt cttctctttt gtaaagtgag gtttaccaac aagtattctt 785
tgactatgag aaatcttggc caggcacagt agctaacgcc tataatccta gcactttggg 845
aggccaaggc aggtggatca cttgagccca ggagtttgag accagccttg gaaacatgat 905
gaaaccccat ctctagaaaa aacaccaaaa aattggacaa gagtgttggc acatgcctgt 965
agtccctgct tcttgggagg ctgaaatggg aggatcacct gagcccagga ggttgaggct 1025
atagtgagcc atgatcgcac tattgcactc ccacctgggt ggcagtgaga cccttcctca 1085
aaaaacaaga aaagaaaaaa aaaaaaaaaa a 1116



<210> 32
<211> 1114
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 117..362

<400> 32
ataaggggac gtctagtggg ttgcccggga ggggtggcgg gagcggtcct ggaaataatc 60
tgtcctctgt cgccgggaac tggcgaggta gttccttcgc ggtggagaga cctgga atg 119
Met
1

gcc aaa tat caa ggt gaa gtt caa agt ttg aaa ctg gat gat gat tca 167
Ala Lys Tyr Gln Gly Glu Val Gln Ser Leu Lys Leu Asp Asp Asp Ser
5 10 15

gtt ata gaa gga gta agc gac caa gta ctt gtg gca gtt gtg gtc agt 215
Val Ile Glu Gly Val Ser Asp Gln Val Leu Val Ala Val Val Val Ser
20 25 30

ttc gct ttg att gct acc ctg gta tat gca ctt ttc aga aat gta cat 263
Phe Ala Leu Ile Ala Thr Leu Val Tyr Ala Leu Phe Arg Asn Val His
35 40 45

caa aac att cac cca gaa aac cag gag cta gta agg gta ctt cga gaa 311
Gln Asn Ile His Pro Glu Asn Gln Glu Leu Val Arg Val Leu Arg Glu
50 55 60 65

cag ctt caa aca gaa cag gat gca cct gct gac tcg aca gca gtt cta 359
Gln Leu Gln Thr Glu Gln Asp Ala Pro Ala Asp Ser Thr Ala Val Leu
70 75 80

cac tgacatgtac tgtcccatct gcctgcacca agcctccttc ccggtggaga 412
His

ccaactgtgg acatcttttt tgtggtgcct gcattattgc ttactggcga tatggttcat 472
ggcttggggc aatcagttgt ccaatctgta gacaaacgag acatggccac attgcattgt 532
ccagaacagc ttagaccatg acagttagca tcgaagccac ctgaggaggg aggcagtaac 592
cttactccta acagtatttg gtgaagatga tcagtctcag gatgttctga gattgcatca 652
ggatattaat gattataacc ggagattctc agggcaaccc agatctgtaa gtaatgctaa 712
agcatgttca aagttagagg aagacacatt tcttctcttt tgtaaagtga ggtttaccaa 772
caagtattct ttgactatga gaaatcttgg ccaggcacag tagctaacgc ctataatcct 832
agcactttgg gaggccaagg caggtggatc acttgagccc aggagtttga gaccagcctt 892
ggaaacatga tgaaacccca tctctagaaa aaacaccaaa aaattggaca agagtgttgg 952
cacatgcctg tagtccctgc ttcttgggag gctgaaatgg gaggatcacc tgagcccagg 1012
aggttgaggc tatagtgagc catgatcgca ctattgcact cccacctggg tggcagtgag 1072
acccttcctc aaaaaacaag aaaagaaaaa aaaaaaaaaa aa 1114

<210> 33
<211> 2072
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 144..1262

<220>
<221> sig_peptide
<222> 144..224
<223> Von Heijne matrix
score 5.14258625256317
seq FLCQRLVLSTLSG/RP

<400> 33
acgtggacgc gtctgggctg ctggaggcag cccgagccgc cgccgtcggt gtcgccgcca 60
ccaccaccat cggagtcacg agtcccgcgt ctgtccgaag tcgccgctct cgggctgctc 120
acgtctcttc ggagagcgcg cac atg gcg act cag gcg cac tcc ctc agc tac 173
Met Ala Thr Gln Ala His Ser Leu Ser Tyr
-25 -20

gca ggg tgc aac ttc ttg tgc caa cgt ctg gtc ctg tct acc ctg agc 221
Ala Gly Cys Asn Phe Leu Cys Gln Arg Leu Val Leu Ser Thr Leu Ser
-15 -10 -5

ggg cgc ccc gtc aaa atc cga aag att cgg gcc aga gac gac aac ccg 269
Gly Arg Pro Val Lys Ile Arg Lys Ile Arg Ala Arg Asp Asp Asn Pro
1 5 10 15

ggc ctc cga gat ttt gaa gcc agc ttc ata agg cta ttg gac aaa ata 317
Gly Leu Arg Asp Phe Glu Ala Ser Phe Ile Arg Leu Leu Asp Lys Ile
20 25 30

acg aat ggt tct cga att gaa ata aac caa aca gga aca acc tta tat 365
Thr Asn Gly Ser Arg Ile Glu Ile Asn Gln Thr Gly Thr Thr Leu Tyr
35 40 45

tat cag cct ggc ctc ctg tat ggt gga tct gtg gaa cat gac tgt agc 413
Tyr Gln Pro Gly Leu Leu Tyr Gly Gly Ser Val Glu His Asp Cys Ser
50 55 60

gtc ctt cgt ggc att ggg tat tac ctg gag agt ctt ctt tgc ttg gct 461
Val Leu Arg Gly Ile Gly Tyr Tyr Leu Glu Ser Leu Leu Cys Leu Ala
65 70 75

cca ttt atg aag cac ccg tta aaa ata gtt cta cga gga gtg acc aat 509
Pro Phe Met Lys His Pro Leu Lys Ile Val Leu Arg Gly Val Thr Asn
80 85 90 95

gat cag att gac cct tca gtt gat gtt ctt aag gca aca gca ctc cct 557
Asp Gln Ile Asp Pro Ser Val Asp Val Leu Lys Ala Thr Ala Leu Pro
100 105 110

ttg ttg aaa caa ttt ggg att gat ggt gaa tca ttt gaa ctg aag att 605
Leu Leu Lys Gln Phe Gly Ile Asp Gly Glu Ser Phe Glu Leu Lys Ile
115 120 125

gtg cga cgg gga atg cct ccc gga gga gga ggc gaa gtg gtt ttc tca 653
Val Arg Arg Gly Met Pro Pro Gly Gly Gly Gly Glu Val Val Phe Ser
130 135 140

tgt cct gtg agg aag gtc ttg aag ccc att caa ctc aca gat cca gga 701
Cys Pro Val Arg Lys Val Leu Lys Pro Ile Gln Leu Thr Asp Pro Gly
145 150 155

aaa atc aaa cgt att aga gga atg gcg tac tct gta cgt gtg tca cct 749
Lys Ile Lys Arg Ile Arg Gly Met Ala Tyr Ser Val Arg Val Ser Pro
160 165 170 175

cag atg gcg aac cgg att gtg gat tct gca agg agc atc ctc aac aag 797
Gln Met Ala Asn Arg Ile Val Asp Ser Ala Arg Ser Ile Leu Asn Lys
180 185 190

ttc ata cct gat atc tat att tac aca gat cac att aaa gga gtc aac 845
Phe Ile Pro Asp Ile Tyr Ile Tyr Thr Asp His Ile Lys Gly Val Asn
195 200 205

tct ggg aag tct ccg ggc ttt ggg ttg tca ctg gtt gct gag acc acc 893
Ser Gly Lys Ser Pro Gly Phe Gly Leu Ser Leu Val Ala Glu Thr Thr
210 215 220

a 9 fc ggc acc ttc ctc agt gct gaa ctg gcc tcc aac ccc cag ggc cag 941
Ser Gly Thr Phe Leu Ser Ala Glu Leu Ala Ser Asn Pro Gln Gly Gln
225 230 235

gga gca gca gta ctt cca gag gac ctt ggc agg aac tgt gcc cgg ctg 989
Gly Ala Ala Val Leu Pro Glu Asp Leu Gly Arg Asn Cys Ala Arg Leu
240 245 250 255

ctg ctg gag gaa atc tac agg ggt gga tgc gta gac tcg acc aac caa 1037
Leu Leu Glu Glu Ile Tyr Arg Gly Gly Cys Val Asp Ser Thr Asn Gln
260 265 270

agc ctg gcg cta cta ctc atg acc ctt gga cag gat gtt tcc aaa cag 1085
Ser Leu Ala Leu Leu Leu Met Thr Leu Gly Gln Asp Val Ser Lys Gln
275 280 285

gtc ctg cta ggc cct ctc tct ccc tac acg ata t- *- cgg i- gaa 1133
Val Leu Leu Gly Pro Leu Ser Pro Tyr Thr Ile Phe Leu Arg His Glu
290 295 300

ttg aag agc ttt ttc cag att atg ttt aaa att acc aag cca tgt gaa 1181
Leu Lys Ser Phe Phe Gln Ile Met Phe Lys Ile Thr Lys Pro Cys Glu
305 310 315

ggt gaa gaa ctc aag ggt ggg gat aaa gtg ctg acc tgt gtt ggc atg 1229
Gly Glu Glu Leu Lys Gly Gly Asp Lys Val Leu Thr Cys Val Gly Met
320 325 330 335

att ggt ttc tcc aac ctt agc agg acc ctc aag tgataaccat cacaagataa 1282
Ile Gly Phe Ser Asn Leu Ser Arg Thr Leu Lys
340 345

ggccccagtg cctacagaca aagcagaagc tgccacggac accaatggga ccaagtccaa 1342
atggattaat ccaggacaga atagccactt gcttaatttt ctgtgaagaa atatcaatat 1402
acaaataaaa gacatccctg tagcatatgg tttccagctg tttctccagt ggcattgcca 1462
ttgcccagga ggggcccagt caccatgaga gctcccttgc cttacctgga ggaagaatgt 1522
gccttcaggc cacagtcgtg ctgctagaac agtctcgtag ctgcagttca gctgtgcttc 1582
ctcagcctac tatcataggc ttcctcagcc ctctgtcata tggctgtttt gcaaacctgt 1642
ggagtctgtt actgttcttt ctgcaaggac tcacctcctt gagccttggt ttttgttgta 1702
gggattaaat gagataatat gagtggcagc tcttcatgag tcctgcagtg ctaagcaaat 1762
gtcagaaatt ggtgtattag actatttatc tttgatcttc tgaatggatt gctgtcatgg 1822
acacggacac ggatcttcat ctggttcatt gtatttatat gtgagggatg gatggctgcg 1882
gggctccaag taagttattg ggatgttttt atattccagg tgtgctgtac gttcttattt 1942
tattttcaca atagctctgt gatgtaagtg ctatctccat gagaaaattc ataaagggtg 2002
ttttgttcat ttgaaatgta taatgtaaag acattaaatc tcctcattta aggaaaaaaa 2062
aaaaaaaaaa 2072



<210> 34
<211> 409
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 35..316

<220>
<221> sig_peptide
<222> 35..109
<223> Von Heijne matrix
score 5.38058532480537
seq AVTSLLSPTPATA/LA

<400> 34
tttttttcga gaccggaagt gagtgatcga aagc atg gcg tcg gtg gtg ttg gcg 55
Met Ala Ser Val Val Leu Ala
-25 -20

ctg agg acc cgg aca gcc gtt aca tcc ttg cta agc ccc act ccg gct 103
Leu Arg Thr Arg Thr Ala Val Thr Ser Leu Leu Ser Pro Thr Pro Ala
-15 -10 -5

aca gct ctt gct gtc aga tac gca tcc aag aag tcg ggt ggt agc tcc 151
Thr Ala Leu Ala Val Arg Tyr Ala Ser Lys Lys Ser Gly Gly Ser Ser
1 5 10

aaa aac ctc ggt gga aag tca tca ggc aga cgc caa ggc att aag aaa 199
Lys Asn Leu Gly Gly Lys Ser Ser Gly Arg Arg Gln Gly Ile Lys Lys
15 20 25 30

atg gaa ggt cac tat gtt cat gct ggg aac atc att gca aca cag cgc 247
Met Glu Gly His Tyr Val His Ala Gly Asn Ile Ile Ala Thr Gln Arg
35 40 45

cat ttc cgc tgg cac cca ggt gcc cat gtg agt tgc tcc gtt gct gcc 295
His Phe Arg Trp His Pro Gly Ala His Val Ser Cys Ser Val Ala Ala
50 55 60

ccc ctt ttt cct ttt cta ggt tgacctctcc ttgcccctaa gcatggtaat 346
Pro Leu Phe Pro Phe Leu Gly
65

aacagttgca tgtattgagt gcttaccaaa tggcaagcat tgtgccaaaa aaaaaaaaaa 406
aaa 409

<210> 35
<211> 836
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 177..767

<220>
<221> sig_peptide
<222> 177..236
<223> Von Heijne matrix
score 6.51720597568932
seq LAVILTLLGLAIL/AI

<400> 35
aatctgctcc acgcaatttc tcagtgatcc tctgcatctc tgcctacaag ggcctccctg 60
acacccaagt tcatattgct cagaaacagt gaacttgagt ttttcatttt accttgatct 120
ctctctgaca aagaaatcca gatgatgcga gacctgatga agacaataca tggaaa atg 179
Met
-20

aca gtc ttg gaa ata act ttg gct gtc atc ctg act cta ctg gga ctt 227
Thr Val Leu Glu Ile Thr Leu Ala Val Ile Leu Thr Leu Leu Gly Leu
-15 -10 -5

gcc atc ctg gct att ttg tta aca aga tgg gca cga cgt aag caa agt 275
Ala Ile Leu Ala Ile Leu Leu Thr Arg Trp Ala Arg Arg Lys Gln Ser
1 5 10

gaa atg tat atc tcc aga tac agt tca gaa caa agt gct aga ctt ctg 323
Glu Met Tyr Ile Ser Arg Tyr Ser Ser Glu Gln Ser Ala Arg Leu Leu
15 20 25

gac tat gag gat ggt aga gga tcc cga cat gca tat tca aca caa agt 371
Asp Tyr Glu Asp Gly Arg Gly Ser Arg His Ala Tyr Ser Thr Gln Ser
30 35 40 45

gag aga tcc aaa aga gat tac aca cca tca acc aac tct cta gca ctg 419
Glu Arg Ser Lys Arg Asp Tyr Thr Pro Ser Thr Asn Ser Leu Ala Leu
50 55 60

tct cga tca agt att gct tta cct caa gga tcc atg agt agt ata aaa 467
Ser Arg Ser Ser Ile Ala Leu Pro Gln Gly Ser Met Ser Ser Ile Lys
65 70 75

tgt tta caa aca act gaa gaa cct cct tcc aga act gca gga gcc atg 515
Cys Leu Gln Thr Thr Glu Glu Pro Pro Ser Arg Thr Ala Gly Ala Met
80 85 90

atg caa ttc aca gcc cct att ccc gga gct aca gga cct atc aag ctc 563
Met Gln Phe Thr Ala Pro Ile Pro Gly Ala Thr Gly Pro Ile Lys Leu
95 100 105

tct caa aaa acc att gtg caa act cta gga cct att gta caa tat cct 611
Ser Gln Lys Thr Ile Val Gln Thr Leu Gly Pro Ile Val Gln Tyr Pro
110 115 120 125

gga tcc aat ggg agg ata aac ata agc cag ctc acc tca gag gat ctc 659
Gly Ser Asn Gly Arg Ile Asn Ile Ser Gln Leu Thr Ser Glu Asp Leu
130 135 140

act ggg gct aaa gga agg gtc aca tct ggt cca cag ttc cct aat agc 707
Thr Gly Ala Lys Gly Arg Val Thr Ser Gly Pro Gln Phe Pro Asn Ser
145 150 155

cac cat gtg cca gag aat cta cat gga tac atg aat tcc ctt tcc ctt 755
His His Val Pro Glu Asn Leu His Gly Tyr Met Asn Ser Leu Ser Leu
160 165 170

ttc tcc cct gct tgactccctc tcccttatgt gtaaacaatt taaaaatatg 807
Phe Ser Pro Ala
175

atagtgtata aatgaaaaaa aaaaaaaaa 836

<210> 36
<211> 1323
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 208..1239

<220>
<221> sig_peptide
<222> 208..294
<223> Von Heijne matrix
score 5.73027134157378
seq GLVLICVCSKTHS/LK

<400> 36
agtctcgtat cgcgcccggg aggcgccgga gcccagcggc tggcgccaga tccaggctcc 60
tggaagaacc atgtccggca gctactggtc atgccaggca cacactgctg cccaagagga 120
gctgctgttt gaattatctg tgaatgttgg gaagaggaat gccagagctg ccggctgaaa 180
attacccaac caagagaaat ctgcagg atg gac ttt ctg gtc ctc ttc ttg ttc 234
Met Asp Phe Leu Val Leu Phe Leu Phe
-25

tac ctg gct tcg gtg ctg atg ggt ctt gtt ctt atc tgc gtc tgc tcg 282
Tyr Leu Ala Ser Val Leu Met Gly Leu Val Leu Ile Cys Val Cys Ser
-20 -15 -10 -5

aaa acc cat agc ttg aaa ggc ctg gcc agg gga gga gca cag ata ttt 330
Lys Thr His Ser Leu Lys Gly Leu Ala Arg Gly Gly Ala Gln Ile Phe
1 5 10

tcc tgt ata att cca gaa tgt ctt cag aga gcc gtg cat gga ttg ctt 378
Ser Cys Ile Ile Pro Glu Cys Leu Gln Arg Ala Val His Gly Leu Leu
15 20 25

cat tac ctt ttc cat acg aga aac cac acc ttc att gtc ctg cac ctg 426
His Tyr Leu Phe His Thr Arg Asn His Thr Phe Ile Val Leu His Leu
30 35 40

gtc ttg caa ggg atg gtt tat act gag tac acc tgg gaa gta ttt ggc 474
Val Leu Gln Gly Met Val Tyr Thr Glu Tyr Thr Trp Glu Val Phe Gly
45 50 55 60

tac tgt cag gag ctg gag ttg tcc ttg cat tac ctt ctt ctg ccc tat 522
Tyr Cys Gln Glu Leu Glu Leu Ser Leu His Tyr Leu Leu Leu Pro Tyr
65 70 75

ctg ctg cta ggt gta aac ctg ttt ttt ttc acc ctg act tgt gga acc 570
Leu Leu Leu Gly Val Asn Leu Phe Phe Phe Thr Leu Thr Cys Gly Thr
80 85 90

aat cct ggc att ata aca aaa gca aat gaa tta tta ttt ctt cat gtt 618
Asn Pro Gly Ile Ile Thr Lys Ala Asn Glu Leu Leu Phe Leu His Val
95 100 105

tat gaa ttt gat gaa gtg atg ttt cca aag aac gtg agg tgc tct act 666
Tyr Glu Phe Asp Glu Val Met Phe Pro Lys Asn Val Arg Cys Ser Thr
110 115 120

tgt gat tta agg aaa cca gct cga tcc aag cac tgc agt gtg tgt aac 714
Cys Asp Leu Arg Lys Pro Ala Arg Ser Lys His Cys Ser Val Cys Asn
125 130 135 140

tgg tgt gtg cac cgt ttc gac cat cac tgt gtt tgg gtg aac aac tgc 762
Trp Cys Val His Arg Phe Asp His His Cys Val Trp Val Asn Asn Cys
145 150 155

atc ggg gcc tgg aac atc agg tac ttc ctc atc tac gtc ttg acc ttg 810
Ile Gly Ala Trp Asn Ile Arg Tyr Phe Leu Ile Tyr Val Leu Thr Leu
160 165 170

acg gcc tcg gct gcc acc gtc gcc att gtg agc acc act ttt ctg gtc 858
Thr Ala Ser Ala Ala Thr Val Ala Ile Val Ser Thr Thr Phe Leu Val
175 180 185

cac ttg gtg gtg atg tca gat tta tac cag gag act tac atc gat gac 906
His Leu Val Val Met Ser Asp Leu Tyr Gln Glu Thr Tyr Ile Asp Asp
190 195 200

ctt gga cac ctc cat gtt atg gac acg gtc ttt ctt att cag tac ctg 954
Leu Gly His Leu His Val Met Asp Thr Val Phe Leu Ile Gln Tyr Leu
205 210 215 220

ttc ctg act ttt cca cgg att gtc ttc atg ctg ggc ttt gtc gtg gtt 1002
Phe Leu Thr Phe Pro Arg Ile Val Phe Met Leu Gly Phe Val Val Val
225 230 235

ctg agc ttc ctc ctg ggt ggc tac ctg ttg ttt gtc ctg tat ctg gcg 1050
Leu Ser Phe Leu Leu Gly Gly Tyr Leu Leu Phe Val Leu Tyr Leu Ala
240 245 250

gcc acc aac cag act act aac gag tgg tac aga ggt gac tgg gcc tgg 1098
Ala Thr Asn Gln Thr Thr Asn Glu Trp Tyr Arg Gly Asp Trp Ala Trp
255 260 265

tgc cag cgt tgt ccc ctt gtg gcc tgg cct ccg tca gca gag ccc caa 1146
Cys Gln Arg Cys Pro Leu Val Ala Trp Pro Pro Ser Ala Glu Pro Gln
270 275 280

gtc cac cgg aac att cac tcc cat ggg ctt cgg agc aac ctt caa gag 1194
Val His Arg Asn Ile His Ser His Gly Leu Arg Ser Asn Leu Gln Glu
285 290 295 300

atc ttt cta cct gcc ttt cca tgt cat gag agg aag aaa caa gaa 1239
Ile Phe Leu Pro Ala Phe Pro Cys His Glu Arg Lys Lys Gln Glu
305 310 315

tgacaagtgt atgactgcct ttgagctgta gttcccgttt atttacacat gtggatcctc 1299
gttttccaaa aaaaaaaaaa aaaa 1323

<210> 37
<211> 1945
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 60..1682

<220>
<221> sig_peptide
<222> 60..143
<223> Von Heijne matrix
score 3.75144398608723
seq SGLLLQVLFRLIT/FV

<400> 37
atcgcgacta aacggagtgg cggcggcatt tcctggtgtc tgagcctggc gcggaggct 59
atg ggc agc cag gag gtg ctg ggc cac gcg gcc cgg ctg tcc tcc tcc 107
Met Gly Ser Gln Glu Val Leu Gly His Ala Ala Arg Leu Ser Ser Ser
-25 -20 -15

ggt ctc ctc ctg cag gtg ttg ttt cgg ttg atc acc ttt gtc ttg aat 155
Gly Leu Leu Leu Gln Val Leu Phe Arg Leu Ile Thr Phe Val Leu Asn
-10 -5 1

gca ttt att ctt cgc ttc ctg tca aag gaa atc gtt ggc gta gta aat 203
Ala Phe Ile Leu Arg Phe Leu Ser Lys Glu Ile Val Gly Val Val Asn
5 10 15 20
gta aga cta acg ctg ctt tac tca acc acc ctc ttc ctg gcc aga gag 251
Val Arg Leu Thr Leu Leu Tyr Ser Thr Thr Leu Phe Leu Ala Arg Glu
25 30 35

gcc ttc cgc aga gca tgt ctc agt ggg gg° acc cag cga gac tgg agc 299
Ala Phe Arg Arg Ala Cys Leu Ser Gly Gly Thr Gln Arg Asp Trp Ser
40 45 50

cag acc ctc aac ctg ctg tgg cta aca gtc ccc ctg ggt gtg ttt tgg 347
Gln Thr Leu Asn Leu Leu Trp Leu Thr Val Pro Leu Gly Val Phe Trp
55 60 65

tcc tta ttc ctg ggc tgg atc tgg ttg cag ctg ctt gaa gtg cct gat 395
Ser Leu Phe Leu Gly Trp Ile Trp Leu Gln Leu Leu Glu Val Pro Asp
70 75 80

cct aat gtt gtc cct cac tat gca act gga gtg gtg ctg ttt ggt ctc 443
Pro Asn Val Val Pro His Tyr Ala Thr Gly Val Val Leu Phe Gly Leu
85 90 95 100

tcg gca gtg gtg gag ctt cta gga gag ccc ttt tgg gtc ttg gca caa 491
Ser Ala Val Val Glu Leu Leu Gly Glu Pro Phe Trp Val Leu Ala Gln
105 110 115

gca cat atg ttt gtg aag ctc aag gtg att gca gag agc ctg tcg gta 539
Ala His Met Phe Val Lys Leu Lys Val Ile Ala Glu Ser Leu Ser Val
120 125 130

att ctt aag acc gtt ctg aca gct ttt ctc gtg ctg tgg ttg cct cac 587
Ile Leu Lys Thr Val Leu Thr Ala Phe Leu Val Leu Trp Leu Pro His
135 140 145

tgg gga ttg tac att ttc tct ttg gcc cag ctt ttc tat acc aca gtt 635
Trp Gly Leu Tyr Ile Phe Ser Leu Ala Gln Leu Phe Tyr Thr Thr Val
150 155 160

ctg gtg ctc tgc tat gtt att tat ttc aca aag tta ctg ggt tcc cca 683
Leu Val Leu Cys Tyr Val Ile Tyr Phe Thr Lys Leu Leu Gly Ser Pro
165 170 175 180

gaa tca acc aag ctt caa act ctt cct gtc tcc aga ata aca gat ctg 731
Glu Ser Thr Lys Leu Gln Thr Leu Pro Val Ser Arg Ile Thr Asp Leu
185 190 195

tta ccc aat att aca aga aat gga gcg ttt ata aac tgg aaa gag gct 779
Leu Pro Asn Ile Thr Arg Asn Gly Ala Phe Ile Asn Trp Lys Glu Ala
200 205 210

aaa ctg act tgg agt ttt ttc aaa cag tct ttc ttg aaa cag att ttg 827
Lys Leu Thr Trp Ser Phe Phe Lys Gln Ser Phe Leu Lys Gln Ile Leu
215 220 225

aca gaa ggc gag cga tat gtg atg aca ttt ttg aat gta ttg aac ttt 875
Thr Glu Gly Glu Arg Tyr Val Met Thr Phe Leu Asn Val Leu Asn Phe
230 235 240

ggt gat cag ggt gtg tat gat ata gtg aat aat ctt ggc tcc ctt gtg 923
Gly Asp Gln Gly Val Tyr Asp Ile Val Asn Asn Leu Gly Ser Leu Val
245 250 255 260

gcc aga tta att ttc cag cca ata gag gaa agt ttt tat ata ttt ttt 971
Ala Arg Leu Ile Phe Gln Pro Ile Glu Glu Ser Phe Tyr Ile Phe Phe
265 270 275

gct aag gtg ctg gag agg gga aag gat gcc aca ctt cag aag cag gag 1019
Ala Lys Val Leu Glu Arg Gly Lys Asp Ala Thr Leu Gln Lys Gln Glu
280 285 290

gac gtt gct gtg gct gct gca gtc ttg gag tcc ctg ctc aag ctg gcc 1067
Asp Val Ala Val Ala Ala Ala Val Leu Glu Ser Leu Leu Lys Leu Ala
295 300 305

ctg ctg gcc ggc ctg acc atc act gtt ttt ggc ttt gcc tat tct cag 1115
Leu Leu Ala Gly Leu Thr Ile Thr Val Phe Gly Phe Ala Tyr Ser Gln
310 315 320

ctg gct ctg gat atc tac gga ggg acc atg ctt agc tca gga tcc ggt 1163
Leu Ala Leu Asp Ile Tyr Gly Gly Thr Met Leu Ser Ser Gly Ser Gly
325 330 335 340

cct gtt ttg ctg cgt tcc tac tgt ctc tat gtt ctc ctg ctt gcc atc 1211
Pro Val Leu Leu Arg Ser Tyr Cys Leu Tyr Val Leu Leu Leu Ala Ile
345 350 355
aat gga gtg aca gag tgt ttc aca ttt gct gcc atg agc aaa gag gag 1259
Asn Gly Val Thr Glu Cys Phe Thr Phe Ala Ala Met Ser Lys Glu Glu
360 365 370

gtc gac agg tac aat ttt gtg atg ctg gcc ctg tcc tcc tca ttc ctg 1307
Val Asp Arg Tyr Asn Phe Val Met Leu Ala Leu Ser Ser Ser Phe Leu
375 380 385

gtg tta tcc tat ctc ttg acc cgt tgg tgt ggc agc gtg ggc ttc atc 1355
Val Leu Ser Tyr Leu Leu Thr Arg Trp Cys Gly Ser Val Gly Phe Ile
390 395 400

ttg gcc aac tgc ttt aac atg ggc att cgg atc acg cag agc ctt tgc 1403
Leu Ala Asn Cys Phe Asn Met Gly Ile Arg Ile Thr Gln Ser Leu Cys
405 410 415 420

ttc atc cac cgc tac tac cga agg agc ccc cac agg ccc ctg gct ggc 1451
Phe Ile His Arg Tyr Tyr Arg Arg Ser Pro His Arg Pro Leu Ala Gly
425 430 435

ctg cac cta tcg cca gtc ctg ctc ggg aca ttt gcc ctc agt ggt ggg 1499
Leu His Leu Ser Pro Val Leu Leu Gly Thr Phe Ala Leu Ser Gly Gly
440 445 450

gtt act gct gtt tcg gag gta ttc ctc tgc tgt gag cag ggc tgg cca 1547
Val Thr Ala Val Ser Glu Val Phe Leu Cys Cys Glu Gln Gly Trp Pro
455 460 465

gcc aga ctg gca cac att gct gtg ggg gcc ttc tgt ctg gga gca act 1595
Ala Arg Leu Ala His Ile Ala Val Gly Ala Phe Cys Leu Gly Ala Thr
470 475 480

ctc ggg aca gca ttc ctc aca gag acc aag ctg atc cat ttc ctc agg 1643
Leu Gly Thr Ala Phe Leu Thr Glu Thr Lys Leu Ile His Phe Leu Arg
485 490 495 500

act cag tta ggt gtg ccc aga cgc act gac aaa atg aca tgacttcagg 1692
Thr Gln Leu Gly Val Pro Arg Arg Thr Asp Lys Met Thr
505 510

gaagcctgga cacccgaggc acctggacca gctatgggta gttctgtggg tggaacacat 1752
tctgtgtaag agccccactg agggctctgc agcggagtga cagcaacccc agagatgagg 1812
caccagagag tgccactgca tgagacacct gtgaccattc gaagtctgaa atgcgggggg 1872
ggagtttcat ttttaagtga agaccaaaag ccctttaaaa ataatagttt tttatcaaaa 1932
aaaaaaaaaa aaa 1945



<210> 38
<211> 1330
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 198..998

<220>
<221> sig_peptide
<222> 198..269
<223> Von Heijne matrix
score 9.08017839002281
seq LLLGPGLLATVRA/EC

<400> 38
agaaatcagc cctttgcaga gggcgcagag ggcctggaaa cctctgggac cttttcccag 60
gaactgttta tggtttcccc ctaggtctag gagacgtaga tgcataggtg gattggatac 120
atcgatggta gctataagag tcgtgtctga acccggcttt tccaattggc ctgctccatc 180
cgaacagcgt caactcc atg gcg cgg ttc ctg aca ctt tgc act tgg ctg 230
Met Ala Arg Phe Leu Thr Leu Cys Thr Trp Leu
-20 -15

ctg ttg ctc ggc ccc ggg ctc ctg gcg acc gtg cgg gcc gaa tgc agc 278
Leu Leu Leu Gly Pro Gly Leu Leu Ala Thr Val Arg Ala Glu Cys Ser
-10 -5 1

cag gat tgc gcg acg tgc agc tac cgc cta gtg cgc ccg gcc gac atc 326

Gln Asp Cys Ala Thr Cys Ser Tyr Arg Leu Val Arg Pro Ala Asp Ile
5 10 15

aac ttc ctg gct tgc gta atg gaa tgt gaa ggt aac ctg cct tct ctg 374
Asn Phe Leu Ala Cys Val Met Glu Cys Glu Gly Lys Leu Pro Ser Leu
20 25 30 35

aaa att tgg gaa acc tgc aag gag ctc ctg cag ctc tcc aaa cca gat 422
Lys Ile Trp Glu Thr Cys Lys Glu Leu Leu Gln Leu Ser Lys Pro Asp
40 45 50

ctt cct caa gat ggc acc agc acc ctc aga gaa aat agc aaa ccg gaa 470
Leu Pro Gln Asp Gly Thr Ser Thr Leu Arg Glu Asn Ser Lys Pro Glu
55 60 65

gaa agc cat ttg cta gcc aaa agg tat ggg ggc ttc atg dad agg f- -a 4- 518
Glu Ser His Leu Leu Ala Lys Arg Tyr Gly Gly Phe Met Lys Arg Tyr
70 75 80

gga ggc ttc atg aag aaa atg gat gag ctt tat ccc atg gag cca gaa 566
Gly Gly Phe Met Lys Lys Met Asp Glu Leu Tyr Pro Met Glu Pro Glu
85 90 95

gaa gag gcc aat gga agt gag atc ctc gcc aag cgc tat ggg ggc ttc 614
Glu Glu Ala Asn Gly Ser Glu Ile Leu Ala Lys Arg Tyr Gly Gly Phe
100 105 110 115

atg aag aag gat gca gag gag gac gac tcg ctg gc< aat tcc tca gac 662
Met Lys Lys Asp Ala Glu Glu Asp Asp Ser Leu Ala Asn Ser Ser Asp
120 125 130

ctg cta aaa gag ctt ctg gaa aca ggg gac aac cga gag cgt agc cac 710
Leu Leu Lys Glu Leu Leu Glu Thr Gly Asp Asn Arg Glu Arg Ser His
135 140 145

cac cag gat ggc agt gat aat gag gaa gaa gtg agc aag aga L. d L. ggg 758
His Gln Asp Gly Ser Asp Asn Glu Glu Glu Val Ser Lys Arg Tyr Gly
150 155 160

ggc ttc atg aga ggc tta aag aga agc ccc caa ctg gaa gat gaa gcc 806
Gly Phe Met Arg Gly Leu Lys Arg Ser Pro Gln Leu Glu Asp Glu Ala
165 170 175

aaa gag ctg cag aag cga tat ggg ggc ttc atg aga aga gta ggt cgc 854
Lys Glu Leu Gln Lys Arg Tyr Gly Gly Phe Met Arg Arg Val Gly Arg
180 185 190 195

cca gag tgg tgg atg gac tac cag aaa cgg tat ggt ttc ctg aag 902
Pro Glu Trp Trp Met Asp Tyr Gln Lys Arg Tyr Gly Phe Leu Lys
200 205 210

cgc ttt gcc gag gct ctg ccc tcc gac gaa gaa gaa agt tac tcc 950
Arg Phe Ala Glu Ala Leu Pro Ser Asp Glu Glu Glu Ser Tyr Ser
215 220 225

aaa gaa gtt cct gaa atg gaa aaa aga tac gga ttt atg aga ttt 998
Lys Glu Val Pro Glu Met Glu Lys Arg Tyr Gly Phe Met Arg Phe
230 235 240

taatattttt cccactagtg gccccaggcc ccagcaagcc tccctccatc ctccagtggg 1058
aaactgttga tggtgtttta ttgtcatgtg ttgcttgcct tgtatagttg acttcattgt 1118
ctggataact atacaacctg aaaactgtca tttcaggttc tgtgctcttt ttggagtctt 1178
taagctcagt attagtctat tgcagctatc tcgttttcat gctaaaatag tttttgttat 1238
cttgtctctt atttttgaca aacatcaata aatgcttact tgtatataga gataataaac 1298
ctattacccc aagtgcaaaa aaaaaaaaaa aa 1330



<210> 39
<211> 2124
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 505..1590

<220>
<221> sig_peptide
<222> 505..624
<223> Von Heijne matrix
score 8.5056444915604
seq VVMLMLLTLLVLG/MV

<400> 39
cctggcataa ctgataggca tgtatgggag gaccacattc ctggggacag cctgggtatg 60
tgacatggca ggtgaccagg ttcccatgaa tgcccgaggc tgtgcccatc ccatgagctg 120
gggcttccct ggaggtaaag agctagggtg gggtggcagt gggtagaacc ccagctggac 180
agctccttcc ttagctctgt gattgctaca gctggttctg gaagccacag gcgccctcag 240
gacaaatggg gcttcttcag cacagggtag tgagtgctga gctaagcaag gacactgtcc 300
ccttctctgc ccaggctcga gctgtgcacc tttaccctgg caattgccct gggtgctgtc 360
ctgctcctgc ccttctccat catcagcaat gaggtgctgc tctccctgcc tcggaactac 420
tacatccagt ggctcaacgg ctccctcatc catggcctct ggaaccttgt ttttctcttc 480
tccaacctgt ccctcatctt cctc atg ccc ttt gca tat ttc ttc act gag 531
Met Pro Phe Ala Tyr Phe Phe Thr Glu
-40 -35

tct gag ggc ttt gct ggc tcc aga aag ggt gtc ctg ggc cgg gtc tat 579
Ser Glu Gly Phe Ala Gly Ser Arg Lys Gly Val Leu Gly Arg Val Tyr
-30 -25 -20

gag aca gtg gtg atg ttg atg ctc ctc act ctg ctg gtg cta ggt atg 627
Glu Thr Val Val Met Leu Met Leu Leu Thr Leu Leu Val Leu Gly Met
-15 -10 -5 1

gtg tgg gtg gca tca gcc att gtg gac aag aac aag gcc aac aga gag 675
Val Trp Val Ala Ser Ala Ile Val Asp Lys Asn Lys Ala Asn Arg Glu
5 10 15

tca ctc tat gac ttt tgg gag tac tat ctc ccc tac ctc tac tca tgc 723
Ser Leu Tyr Asp Phe Trp Glu Tyr Tyr Leu Pro Tyr Leu Tyr Ser Cys
20 25 30

atc tcc ttc ctt ggg gtt ctg ctg ctc ctg gtg tgt act cca ctg ggt 771
Ile Ser Phe Leu Gly Val Leu Leu Leu Leu Val Cys Thr Pro Leu Gly
35 40 45

ctc gcc cgc atg ttc tcc gtc act ggg aag ctg cta gtc aag ccc cgg 819
Leu Ala Arg Met Phe Ser Val Thr Gly Lys Leu Leu Val Lys Pro Arg
50 55 60 65

ctg ctg gaa gac ctg gag gag cag ctg tac tgc tca gcc ttt gag gag 867
Leu Leu Glu Asp Leu Glu Glu Gln Leu Tyr Cys Ser Ala Phe Glu Glu
70 75 80

gca gcc ctg acc cgc agg atc tgt aat cct act tcc tgc tgg ctg cct 915
Ala Ala Leu Thr Arg Arg Ile Cys Asn Pro Thr Ser Cys Trp Leu Pro
85 90 95

tta gac atg gag ctg cta cac aga cag gtc ctg gct ctg cag aca cag 963
Leu Asp Met Glu Leu Leu His Arg Gln Val Leu Ala Leu Gln Thr Gln
100 105 110

agg gtc ctg ctg gag aag agg cgg aag gct tca gcc tgg caa cgg aac 1011
Arg Val Leu Leu Glu Lys Arg Arg Lys Ala Ser Ala Trp Gln Arg Asn
115 120 125

ctg ggc tac ccc ctg get atg ctg tgc ttg ctg gtg ctg acg ggc ctg 1059 
Leu Gly Tyr Pro Leu Ala Met Leu Cys Leu Leu Val Leu Thr Gly Leu 
130 135 140 145 

50 tct gtg etc att gtg gee ate cac ate ctg gag ctg etc ate gat gag 1107 
Ser Val Leu lie Val Ala lie His lie Leu Glu Leu Leu lie Asp Glu 

150 155 160 

get gee atg ccc cga ggc atg cag ggt acc tec tta ggc cag gtc tec 1155 
Ala Ala Met Pro Arg Gly Met Gin Gly Thr Ser Leu Gly Gin Val Ser 

55 165 170 175 

ttc tec aag ctg ggc tec ttt ggt gee gtc att cag gtt gta etc ate 1203 
Phe Ser Lys Leu Gly Ser Phe Gly Ala Val lie Gin Val Val Leu lie 

180 185 190 

ttt tac eta atg gtg tec tca gtt gtg ggc ttc tat age tct cca etc 1251 

60 Phe Tyr Leu Met Val Ser Ser Val Val Gly Phe Tyr Ser Ser Pro Leu 
195 200 205 

ttc egg age ctg egg ccc aga tgg cac gac act gee atg acg cag ata 1299 
Phe Arg Ser Leu Arg Pro Arg Trp His Asp Thr Ala Met Thr Gin lie 

210 215 220 225

att ggg aac tgt gtc tgt ctc ctg gtc cta agc tca gca ctt cct gtc 1347
Ile Gly Asn Cys Val Cys Leu Leu Val Leu Ser Ser Ala Leu Pro Val
230 235 240

ttc tct cga acc ctg ggg ctc act cgc ttt gac ctg ctg ggt gac ttt 1395
Phe Ser Arg Thr Leu Gly Leu Thr Arg Phe Asp Leu Leu Gly Asp Phe
245 250 255

gga cgc ttc aac tgg ctg ggc aat ttc tac att gtg ttc ctc tac aac 1443
Gly Arg Phe Asn Trp Leu Gly Asn Phe Tyr Ile Val Phe Leu Tyr Asn
260 265 270

gca gcc ttt gca ggc ctc acc aca ctc tat ctg gtg aag acc ttc act 1491
Ala Ala Phe Ala Gly Leu Thr Thr Leu Tyr Leu Val Lys Thr Phe Thr
275 280 285

gca gct gtg cgg gca gag ctg atc cgg gcc ttt ggg ctg gac aga ctg 1539
Ala Ala Val Arg Ala Glu Leu Ile Arg Ala Phe Gly Leu Asp Arg Leu
290 295 300 305

ccg ctg ccc gtc tcc ggt ttc ccc cag gca tct agg aag acc cag cac 1587
Pro Leu Pro Val Ser Gly Phe Pro Gln Ala Ser Arg Lys Thr Gln His
310 315 320

cag tgacctccag ctgggggtgg gaagaaaaaa actggacact gccatctgct 1640
Gln

gcctaggcct ggagggaagc ccaaggctac ttggacctca ggacctggaa tctgagaggg 1700
tgggtggcag aggggagcag agccatctgc actattgcat aatctgagcc agagtttggg 1760
accaggacct cctgcttttc catacttaac tgtggcctca gcatggggta gggctgggtg 1820
actgggtcta gcccctgatc ccaaatctgt ttacacatca atctgcctca ctgctgttct 1880
gggccatccc catagccatg tttacatgat ttgatgtgca atagggtggg gtaggggcag 1940
ggaaaggact gggccagggc aggctcggga gatagattgt ctcccttgcc tctggcccag 2000
cagagcctaa gcactgtgct atcctggagg ggctttggac cacctgaaag accaagggga 2060
tagggaggag gaggcttcag ccatcagcaa taaagttgat cccaggcaaa aaaaaaaaaa 2120
aaaa 2124



<210> 40
<211> 1159
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 84..326

<220>
<221> sig_peptide
<222> 84..146
<223> Von Heijne matrix
score 6.39000252120129
seq LGLSVLLTAATVA/GV

<400> 40

agtacaggcg gcggtgcgca ctctgcggcg gcctctgcgc ctcgggcggg cgggagagag 60
aggccgcggc cgccagcgtg ggg atg tct agg agc tcg aag gtg gtg ctg ggc 113
Met Ser Arg Ser Ser Lys Val Val Leu Gly
-20 -15

ctc tcg gtg ctg ctg acg gcg gcc aca gtg gcc ggc gta cat gtg aag 161
Leu Ser Val Leu Leu Thr Ala Ala Thr Val Ala Gly Val His Val Lys
-10 -5 1 5

cag cag tgg gac cag cag agg ctt cgt gac gga gtt atc aga gac att 209
Gln Gln Trp Asp Gln Gln Arg Leu Arg Asp Gly Val Ile Arg Asp Ile
10 15 20

gag agg caa att cgg aaa aaa gaa aac att cgt ctt ttg gga gaa cag 257
Glu Arg Gln Ile Arg Lys Lys Glu Asn Ile Arg Leu Leu Gly Glu Gln
25 30 35

att att ttg act gag caa ctt gaa gca gaa aga gag aag atg tta ttg 305
Ile Ile Leu Thr Glu Gln Leu Glu Ala Glu Arg Glu Lys Met Leu Leu
40 45 50

gca aaa gga tct caa aaa tca tgacttgaat gtgaaatatc tgttggacag 356
Ala Lys Gly Ser Gln Lys Ser
55 60

acaacacgag tttgtgtgtg tgtgttgatg gagagtagct tagtagtatc ttcatctttt 416
tttttggtca ctgtcctttt aaacttgatc aaataaagga cagtgggtca tataagttac 476
tgctttcagg gtcccttata tctgaataaa ggagtgtggg cagacacttt ttggaagagt 536
ctgtctgggt gatcctggta gaagccccat tagggtcact gtccagtgct tagggttgtt 596
actgagaagc actgccgagc ttgtgagaag gaagggatgg atagtagcat ccacctgagt 656
agtctgatca gtcggcatga tgacgaagcc acgagaacat cgacctcaga aggactggag 716
gaaggtgaaa gtggagggag agacgctcct gatcgtcgaa tyccgaggat caggkcatca 776
gtggacttat cgcacgacca gagtggggat tccctcaaca gtgatgaagg agacgtgtct 836
tggatggagg agcagctgtc ctacttctgt gacaagtgcc aaaaatggat accagccagt 896
aaggagcttc tcaattcctt tgatttgtca attcctgtgt gaaggtttgt ttttccaacc 956
tgtgaaagaa acgtgaatgt aaaagagacc taaataaaag gataattata tttattctct 1016
agttgatcag ctataaattt atataaaaca taggcatgtt tgtactaatg aaacgtactg 1076
tcaacctcta tcacattgtt aaattaacac ttttggtggt aactcaataa aattgagaaa 1136
attgcaaaaa aaaaaaaaaa aaa 1159



<210> 41
<211> 1953
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 56..1678

<220>
<221> sig_peptide
<222> 56..139
<223> Von Heijne matrix
score 3.75144398608723
seq SGLLLQVLFRLIT/FV

<400> 41

agactaaacg gagtggcggc ggcatttcct ggtgtctgag cctggcgcgg aggct atg 58
Met

ggc agc cag gag gtg ctg ggc cac gcg gcc cgg ctg gcc tcc tcc ggt 106
Gly Ser Gln Glu Val Leu Gly His Ala Ala Arg Leu Ala Ser Ser Gly
-25 -20 -15

ctc ctc ctg cag gtg ttg ttt cgg ttg atc acc ttt gtc ttg aat gca 154
Leu Leu Leu Gln Val Leu Phe Arg Leu Ile Thr Phe Val Leu Asn Ala
-10 -5 1 5

ttt att ctt cgc ttc ctg tca aag gaa atc gtt ggc gta gta aat gta 202
Phe Ile Leu Arg Phe Leu Ser Lys Glu Ile Val Gly Val Val Asn Val
10 15 20

aga cta acg ctg ctt tac tca acc acc ctc ttc ctg gcc aga gag gcc 250
Arg Leu Thr Leu Leu Tyr Ser Thr Thr Leu Phe Leu Ala Arg Glu Ala
25 30 35

ttc cgc aga gca tgt ctc agt ggg ggc acc cag cga gac tgg agc cag 298
Phe Arg Arg Ala Cys Leu Ser Gly Gly Thr Gln Arg Asp Trp Ser Gln
40 45 50

acc ctc aac ctg ctg tgg cta aca gtc ccc ctg ggt gtg ttt tgg tcc 346
Thr Leu Asn Leu Leu Trp Leu Thr Val Pro Leu Gly Val Phe Trp Ser
55 60 65

tta ttc ctg ggc tgg atc tgg ttg cag ctg ctt gaa gtg cct gat cct 394
Leu Phe Leu Gly Trp Ile Trp Leu Gln Leu Leu Glu Val Pro Asp Pro
70 75 80 85

aat gtt gtc cct cac tat gca act gga gtg gtg ctg ttt ggt ctc tcg 442
Asn Val Val Pro His Tyr Ala Thr Gly Val Val Leu Phe Gly Leu Ser
90 95 100

gca gtg gtg gag ctt cta gga gag ccc ttt tgg gtc ttg gca caa gca 490
Ala Val Val Glu Leu Leu Gly Glu Pro Phe Trp Val Leu Ala Gln Ala
105 110 115

cat atg ttt gtg aag ctc aag gtg att gca gag agc ctg tcg gta att 538
His Met Phe Val Lys Leu Lys Val Ile Ala Glu Ser Leu Ser Val Ile
120 125 130

ctt aag agc gtt ctg aca gct ttt ctc gtg ctg tgg ttg cct cac tgg 586
Leu Lys Ser Val Leu Thr Ala Phe Leu Val Leu Trp Leu Pro His Trp
135 140 145

gga ttg tac att ttc tct ttg gcc cag ctt ttc tat acc aca gtt ctg 634
Gly Leu Tyr Ile Phe Ser Leu Ala Gln Leu Phe Tyr Thr Thr Val Leu
150 155 160 165

gtg ctc tgc tat gtt att tat ttc aca aag tta ctg ggt tcc cca gaa 682
Val Leu Cys Tyr Val Ile Tyr Phe Thr Lys Leu Leu Gly Ser Pro Glu
170 175 180

tca acc aag ctt caa act ctt cct gtc tcc aga ata aca gat ctg tta 730
Ser Thr Lys Leu Gln Thr Leu Pro Val Ser Arg Ile Thr Asp Leu Leu
185 190 195

ccc aat att aca aga aat gga gcg ttt ata aac tgg aaa gag gct aaa 778
Pro Asn Ile Thr Arg Asn Gly Ala Phe Ile Asn Trp Lys Glu Ala Lys
200 205 210

ctg act tgg agt ttt ttc aaa cag tct ttc ttg aaa cag att ttg aca 826
Leu Thr Trp Ser Phe Phe Lys Gln Ser Phe Leu Lys Gln Ile Leu Thr
215 220 225

gaa ggc gag cga tat gtg atg aca ttt ttg aat gta ttg aac ttt ggt 874
Glu Gly Glu Arg Tyr Val Met Thr Phe Leu Asn Val Leu Asn Phe Gly
230 235 240 245

gat cag ggt gtg tat gat ata gtg aat aat ctt ggc tcc ctt gtg gcc 922
Asp Gln Gly Val Tyr Asp Ile Val Asn Asn Leu Gly Ser Leu Val Ala
250 255 260

aga tta att ttc cag cca ata gag gaa agt ttt tat ata ttt ttt gct 970
Arg Leu Ile Phe Gln Pro Ile Glu Glu Ser Phe Tyr Ile Phe Phe Ala
265 270 275

aag gtg ctg gag agg gga aag gat gcc aca ctt cag aag cag gag gac 1018
Lys Val Leu Glu Arg Gly Lys Asp Ala Thr Leu Gln Lys Gln Glu Asp
280 285 290

gtt gct gtg gct gct gca gtc ttg gag tcc ctg ctc aag ctg gcc ctg 1066
Val Ala Val Ala Ala Ala Val Leu Glu Ser Leu Leu Lys Leu Ala Leu
295 300 305

ctg gcc ggc ctg acc atc act gtt ttt ggc ttt gcc tat tct cag ctg 1114
Leu Ala Gly Leu Thr Ile Thr Val Phe Gly Phe Ala Tyr Ser Gln Leu
310 315 320 325

gct ctg gat atc aac gga ggg acc atg ctt agc tca gga tcc ggt cct 1162
Ala Leu Asp Ile Asn Gly Gly Thr Met Leu Ser Ser Gly Ser Gly Pro
330 335 340

gtt ttg ctg cgt tcc tac tgt ctc tat gtt ctc ctg ctt gcc atc aat 1210
Val Leu Leu Arg Ser Tyr Cys Leu Tyr Val Leu Leu Leu Ala Ile Asn
345 350 355

gga gtg aca gag tgt ttc aca ttt gct gcc atg agc aaa gag gag gtc 1258
Gly Val Thr Glu Cys Phe Thr Phe Ala Ala Met Ser Lys Glu Glu Val
360 365 370

gac agg tac aat ttt gtg atg ctg gcc ctg tcc tcc tca ttc ctg gtg 1306
Asp Arg Tyr Asn Phe Val Met Leu Ala Leu Ser Ser Ser Phe Leu Val
375 380 385

tta tcc tat ctc ttg acc cgt tgg tgt ggc agc gtg ggc ttc atc ttg 1354
Leu Ser Tyr Leu Leu Thr Arg Trp Cys Gly Ser Val Gly Phe Ile Leu
390 395 400 405

gcc aac tgc ttt aac atg ggc att cgg atc acg cag agc ctt tgc ttc 1402
Ala Asn Cys Phe Asn Met Gly Ile Arg Ile Thr Gln Ser Leu Cys Phe
410 415 420

atc cac cgc tac tac cga agg agc ccc cac agg ccc ctg gct ggc ctg 1450
Ile His Arg Tyr Tyr Arg Arg Ser Pro His Arg Pro Leu Ala Gly Leu
425 430 435

cac cta tcg cca gtc ctg ctc ggg aca ttt gcc ctc agt ggt ggg gtt 1498
His Leu Ser Pro Val Leu Leu Gly Thr Phe Ala Leu Ser Gly Gly Val
440 445 450

act gct gtt tcg gag gta ttc ctc tgc tgt gag cag ggc tgg cca gcc 1546
Thr Ala Val Ser Glu Val Phe Leu Cys Cys Glu Gln Gly Trp Pro Ala
455 460 465

aga ctg gca cac att gct gtg ggg gcc ttc tgt ctg gga gca act ctc 1594
Arg Leu Ala His Ile Ala Val Gly Ala Phe Cys Leu Gly Ala Thr Leu
470 475 480 485

ggg aca gca ttc ctc aca gag acc aag ctg atc cat ttc ctc agg act 1642
Gly Thr Ala Phe Leu Thr Glu Thr Lys Leu Ile His Phe Leu Arg Thr
490 495 500

cag tta ggt gtg ccc aga cgc act gac aaa atg acg tgacttcagg 1688
Gln Leu Gly Val Pro Arg Arg Thr Asp Lys Met Thr
505 510

gaagcctgga cacccgaggc acctggacca gctatgggta gttctgtggg tggaacacat 1748
tctgtgtaag agccccactg agggctctgc agcggagtga cagcaacccc agagatgagg 1808
caccagagag tgccactgca tgagacacct gtgaccattc gaagtctgaa atgcgggggg 1868
ggagtttcat ttttaagtga agaccaaaag ccctttaaaa ataatagttt tttatcattt 1928
tatagtgaaa aaaaaaaaaa aaaaa 1953

<210> 42
<211> 1688
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 119..1522

<220>
<221> sig_peptide
<222> 119..181
<223> Von Heijne matrix
score 11.6921972463885
seq LLLCLALSGAAET/KP

<400> 42

aaaaggctgc aggctgccag gtgtgcttgg agagccccct tcttccgccg ggcctcgcaa 60
gcagcgtagg actgtggaga agggcggtgg gcaaggaggg aactcgagag cagcctcc 118

atg ggc aca cag gag ggc tgg tgc ctg ctg ctc tgc ctg gct cta tct 166
Met Gly Thr Gln Glu Gly Trp Cys Leu Leu Leu Cys Leu Ala Leu Ser
-20 -15 -10

gga gca gca gaa acc aag ccc cac cca gca gag ggg cag tgg cgg gca 214
Gly Ala Ala Glu Thr Lys Pro His Pro Ala Glu Gly Gln Trp Arg Ala
-5 1 5 10

gtg gac gtg gtc cta gac tgt ttc ctg gtg aag gac ggt gcg cac cgt 262
Val Asp Val Val Leu Asp Cys Phe Leu Val Lys Asp Gly Ala His Arg
15 20 25

gga gct ctc gcc agc agt gag gac agg gca agg gcc tcc ctt gtg ctg 310
Gly Ala Leu Ala Ser Ser Glu Asp Arg Ala Arg Ala Ser Leu Val Leu
30 35 40

aag cag gtg cca gtg ctg gac gat ggc tcc ctg gag gac ttc acc gat 358
Lys Gln Val Pro Val Leu Asp Asp Gly Ser Leu Glu Asp Phe Thr Asp
45 50 55

ttc caa ggg ggc aca ctg gcc caa gat gac cca cct att atc ttt gag 406
Phe Gln Gly Gly Thr Leu Ala Gln Asp Asp Pro Pro Ile Ile Phe Glu
60 65 70 75

gcc tca gtg gac ctg gtc cag att ccc cag gcc gag gcc ttg ctc cat 454
Ala Ser Val Asp Leu Val Gln Ile Pro Gln Ala Glu Ala Leu Leu His
80 85 90

gct gac tgc agt ggg aag gag gtg acc tgt gag atc tcc cgc tac ttt 502
Ala Asp Cys Ser Gly Lys Glu Val Thr Cys Glu Ile Ser Arg Tyr Phe
95 100 105

ctc cag atg aca gag acc act gtt aag aca gca gct tgg ttc atg gcc 550
Leu Gln Met Thr Glu Thr Thr Val Lys Thr Ala Ala Trp Phe Met Ala
110 115 120

aac gtg cag gtc tct gga ggg gga cct agc atc tcc ttg gtg atg aag 598
Asn Val Gln Val Ser Gly Gly Gly Pro Ser Ile Ser Leu Val Met Lys
125 130 135

act ccc agg gtc gcc aag aat gag gtg ctc tgg cac cca acg ctg aac 646
Thr Pro Arg Val Ala Lys Asn Glu Val Leu Trp His Pro Thr Leu Asn
140 145 150 155

ttg cca ctg agc ccc cag ggg act gtg cga act gca gtg gag ttc cag 694
Leu Pro Leu Ser Pro Gln Gly Thr Val Arg Thr Ala Val Glu Phe Gln
160 165 170

gtg atg aca cag acc caa tcc ctg agc ttc ctg ctg ggg tcc tca gcc 742
Val Met Thr Gln Thr Gln Ser Leu Ser Phe Leu Leu Gly Ser Ser Ala
175 180 185

tcc ttg gac tgt ggc ttc tcc atg gca ccg ggc ttg gac ctc atc agt 790
Ser Leu Asp Cys Gly Phe Ser Met Ala Pro Gly Leu Asp Leu Ile Ser
190 195 200

gtg gag tgg cga ctg cag cac aag ggc agg ggt cag ttg gtg tac agc 838
Val Glu Trp Arg Leu Gln His Lys Gly Arg Gly Gln Leu Val Tyr Ser
205 210 215

tgg acc gca ggg cag ggg cag gct gtg cgg aag ggc gct acc ctg gag 886
Trp Thr Ala Gly Gln Gly Gln Ala Val Arg Lys Gly Ala Thr Leu Glu
220 225 230 235

cct gca caa ctg ggc atg gcc agg gat gcc tcc ctc acc ctg ccc ggc 934
Pro Ala Gln Leu Gly Met Ala Arg Asp Ala Ser Leu Thr Leu Pro Gly
240 245 250

ctc act ata cag gac gag ggg acc tac att tgc cag atc acc acc tct 982
Leu Thr Ile Gln Asp Glu Gly Thr Tyr Ile Cys Gln Ile Thr Thr Ser
255 260 265

ctg tac cga gct cag cag atc atc cag ctc aac atc caa gct tcc cct 1030
Leu Tyr Arg Ala Gln Gln Ile Ile Gln Leu Asn Ile Gln Ala Ser Pro
270 275 280

aaa gta cga ctg agc ttg gca aac gaa gct ctg ctg ccc acc ctc atc 1078
Lys Val Arg Leu Ser Leu Ala Asn Glu Ala Leu Leu Pro Thr Leu Ile
285 290 295

tgc gac att gct ggc tat tac cct ctg gat gtg gtg gtg acg tgg acc 1126
Cys Asp Ile Ala Gly Tyr Tyr Pro Leu Asp Val Val Val Thr Trp Thr
300 305 310 315

cga gag gag ctg ggt gga tcc cca gcc caa gtc tct ggt gcc tcc ttc 1174
Arg Glu Glu Leu Gly Gly Ser Pro Ala Gln Val Ser Gly Ala Ser Phe
320 325 330

tcc agc ctc agg caa agc gtg gca ggc acc tac agc atc tcc tcc tct 1222
Ser Ser Leu Arg Gln Ser Val Ala Gly Thr Tyr Ser Ile Ser Ser Ser
335 340 345

ctc acc gca gaa cct ggc tct gca ggt gcc act tac acc tgc cag gtc 1270
Leu Thr Ala Glu Pro Gly Ser Ala Gly Ala Thr Tyr Thr Cys Gln Val
350 355 360

aca cac atc tct ctg gag gag ccc ctt ggg gcc agc acc cag gtt gtc 1318
Thr His Ile Ser Leu Glu Glu Pro Leu Gly Ala Ser Thr Gln Val Val
365 370 375

cca cca gag cgg aga aca gcc ttg gga gtc atc ttt gcc agc agt ctc 1366
Pro Pro Glu Arg Arg Thr Ala Leu Gly Val Ile Phe Ala Ser Ser Leu
380 385 390 395

ttc ctt ctt gca ctg atg ttc ctg ggg ctt cag aga cgg caa gca cct 1414
Phe Leu Leu Ala Leu Met Phe Leu Gly Leu Gln Arg Arg Gln Ala Pro
400 405 410

aca gga ctt ggg ctg ctt cag gct gaa cgc tgg gag acc act tcc tgt 1462
Thr Gly Leu Gly Leu Leu Gln Ala Glu Arg Trp Glu Thr Thr Ser Cys
415 420 425

gct gac aca cag agc tcc cat ctc cat gaa gac cgc aca gcg cgt gta 1510
Ala Asp Thr Gln Ser Ser His Leu His Glu Asp Arg Thr Ala Arg Val
430 435 440

agc cag ccc agc tgacctaaag cgacatgaga ctactagaaa gaaacgacac 1562
Ser Gln Pro Ser
445

ccttccccaa gcccccacag ctactccaac ccaaacaaca accaagccag tttaatggta 1622
ggaatttgta ttttttgcct ttgttcagaa tacatgacat tggtaaataa aaaaaaaaaa 1682
aaaaaa 1688

<210> 43
<211> 1942
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 334..1551

<220>
<221> sig_peptide
<222> 334..426
<223> Von Heijne matrix
score 4.0554926521937
seq TVFLLVTLQALDT/VE

<400> 43

gctcataggg agaaaggaag ctgcgctgcg ttctgcggga cgaaccctgc tccgcgcgag 60
aatttttttg attccttctt atttggagaa atctccagct gctctgatca tagcctaaga 120
agactgcatg ctgcttcctc tcgatgccaa gccagaccct ctcacaacct cggatctcag 180
tccttcatgg agacctggtc ccagcaggaa tggcagtgca ggaaattggc gcccagatgg 240
ttcttccatg tgaagttgtc tcgggctctg ggctgacgag agaacacctg gtaaccaggt 300
tagccctctg tcagtcaccc agggcagggc agc atg gtg cgg att cag agg agg 354
Met Val Arg Ile Gln Arg Arg
-30 -25

aag ctt ttg gca tct tgc ctg tgc gtc aca gcc acc gtc ttt ctg ctt 402
Lys Leu Leu Ala Ser Cys Leu Cys Val Thr Ala Thr Val Phe Leu Leu
-20 -15 -10

gtc aca ctc cag gcc ttg gat acc gtt gag aat cta atg aaa gtc acg 450
Val Thr Leu Gln Ala Leu Asp Thr Val Glu Asn Leu Met Lys Val Thr
-5 1 5

ggc cct ccc cag gga gtt aca gac tcc atg caa tgc ttc aat gat cag 498
Gly Pro Pro Gln Gly Val Thr Asp Ser Met Gln Cys Phe Asn Asp Gln
10 15 20

tgg cct tta tct aac acc agg agc agc gag cac ata aaa gag gtc atg 546
Trp Pro Leu Ser Asn Thr Arg Ser Ser Glu His Ile Lys Glu Val Met
25 30 35 40

gtt gag ctg ggg aag ttt gaa agg aag gag ttt aaa agt tcc agt ttg 594
Val Glu Leu Gly Lys Phe Glu Arg Lys Glu Phe Lys Ser Ser Ser Leu
45 50 55

caa gat gga cat aca aaa atg gag gaa gca cct acg cat ctt aat tca 642
Gln Asp Gly His Thr Lys Met Glu Glu Ala Pro Thr His Leu Asn Ser
60 65 70

ttt ctt aag aaa gaa gga ttg acc ttc aac agg aaa aga aaa tgg gaa 690
Phe Leu Lys Lys Glu Gly Leu Thr Phe Asn Arg Lys Arg Lys Trp Glu
75 80 85

ttg gac agc tac ccc att atg ctc tgg tgg tcc ccg ctg acg ggg gag 738
Leu Asp Ser Tyr Pro Ile Met Leu Trp Trp Ser Pro Leu Thr Gly Glu
90 95 100

act ggg agg tta ggc caa tgt gga gca gat gct tgt ttc ttc acc atc 786
Thr Gly Arg Leu Gly Gln Cys Gly Ala Asp Ala Cys Phe Phe Thr Ile
105 110 115 120

aac cgg acc tac ctc cat cat cac atg acc aaa gca ttc ctc ttc tat 834
Asn Arg Thr Tyr Leu His His His Met Thr Lys Ala Phe Leu Phe Tyr
125 130 135

ggt act gac ttt aac ata gat agc tta cct ctg cct cgg aaa gcc cat 882
Gly Thr Asp Phe Asn Ile Asp Ser Leu Pro Leu Pro Arg Lys Ala His
140 145 150

cat gac tgg gct gtt ttt cat gaa gag tcc ccg aaa aac aat tat aag 930
His Asp Trp Ala Val Phe His Glu Glu Ser Pro Lys Asn Asn Tyr Lys
155 160 165

ctc ttt cat aaa cca gtg atc acc ttg ttc aac tac act gcc acg ttc 978
Leu Phe His Lys Pro Val Ile Thr Leu Phe Asn Tyr Thr Ala Thr Phe
170 175 180

agc agg cat tcc cac ttg cca cta act acc caa tac ttg gag agc att 1026
Ser Arg His Ser His Leu Pro Leu Thr Thr Gln Tyr Leu Glu Ser Ile
185 190 195 200

gaa gtc ctg aag tca ctc cga tac cta gtt cct ttg cag tcc aaa aac 1074
Glu Val Leu Lys Ser Leu Arg Tyr Leu Val Pro Leu Gln Ser Lys Asn
205 210 215

aag ctt aga aaa aga ctt gct ccg ctg gtg tat gta cag tca tac tgt 1122
Lys Leu Arg Lys Arg Leu Ala Pro Leu Val Tyr Val Gln Ser Tyr Cys
220 225 230

gac cca cca tca gac agg gac agc tat gtt cgc gag ctg atg act tac 1170
Asp Pro Pro Ser Asp Arg Asp Ser Tyr Val Arg Glu Leu Met Thr Tyr
235 240 245

atc gag gtc gat tcc tat ggt gaa tgt tta cga aac aaa gac ctc cct 1218
Ile Glu Val Asp Ser Tyr Gly Glu Cys Leu Arg Asn Lys Asp Leu Pro
250 255 260

cag cag ctg aaa aat cca gcc tct atg gat gcc gat ggc ttt tat agg 1266
Gln Gln Leu Lys Asn Pro Ala Ser Met Asp Ala Asp Gly Phe Tyr Arg
265 270 275 280

atc att gca cag tat aag ttt atc cta gct ttt gag aat gca gtt tgt 1314
Ile Ile Ala Gln Tyr Lys Phe Ile Leu Ala Phe Glu Asn Ala Val Cys
285 290 295

gat gac tac atc act gag aag ttc tgg agg cca ctg aaa ctg ggg gta 1362
Asp Asp Tyr Ile Thr Glu Lys Phe Trp Arg Pro Leu Lys Leu Gly Val
300 305 310

gtc cct gta tat tac gga tcc ccc agc atc aca gac tgg ctt cca agt 1410
Val Pro Val Tyr Tyr Gly Ser Pro Ser Ile Thr Asp Trp Leu Pro Ser
315 320 325

aac aaa agt gct att ctt gta tca gaa ttt tct cac ccc agg gaa ctg 1458
Asn Lys Ser Ala Ile Leu Val Ser Glu Phe Ser His Pro Arg Glu Leu
330 335 340

gca agt tac atc aga cga ctg gat tct gat gac aga ttg tat gag gcc 1506
Ala Ser Tyr Ile Arg Arg Leu Asp Ser Asp Asp Arg Leu Tyr Glu Ala
345 350 355 360

tat gta gaa tgg aag ctg aag ggt aga tct cta acc agc gac ttc 1551
Tyr Val Glu Trp Lys Leu Lys Gly Arg Ser Leu Thr Ser Asp Phe
365 370 375

tgacagctct cagggaacgg aaatggggag tgcaagacgt caaccaggac aattacatcg 1611
atgcatttga gtgtatggtg tgcaccaagg tgtgggctaa tatcaggctt caggaaaagg 1671
gcttaccacc caaaagatgg gaggcagaag atacccacct gagttgccca gagcccacag 1731
tgtttgcttt ctcaccactc cggactccac ctttgagctc tttgcgagag atgtggattt 1791
ccagctttga acaatccaag aaagaagccc aggcactaag gtggctggtt gataggaatc 1851
aaaacttttc atctcaagag ttttggggcc tagtattcaa ggactgattt caaaaatgat 1911
cagaatgaaa cagaaaaaaa aaaaaaaaaa a 1942

<210> 44
<211> 1657
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 72..986

<220>
<221> sig_peptide
<222> 72..149
<223> Von Heijne matrix
score 6.33091407142367
seq GVGLVTLLGLAVG/SY

<400> 44

ctccgacccg ccccgcggcg cattgtggga tctgtcggct tgtcaggtgg tggaggaaaa 60
ggcgctccgt c atg ggg atc cag acg agc ccc gtc ctg ctg gcc tcc ctg 110
Met Gly Ile Gln Thr Ser Pro Val Leu Leu Ala Ser Leu
-25 -20 -15

ggg gtg ggg ctg gtc act ctg ctc ggc ctg gct gtg ggc tcc tac ttg 158
Gly Val Gly Leu Val Thr Leu Leu Gly Leu Ala Val Gly Ser Tyr Leu
-10 -5 1

gtt cgg agg tcc cgc cgg cct cag gtc act ctc ctg gac ccc aat gaa 206
Val Arg Arg Ser Arg Arg Pro Gln Val Thr Leu Leu Asp Pro Asn Glu
5 10 15

aag tac ctg cta cga ctg cta gac aag acg act gtg agc cac aac acc 254
Lys Tyr Leu Leu Arg Leu Leu Asp Lys Thr Thr Val Ser His Asn Thr
20 25 30 35

aag agg ttc cgc ttt gcc ctg ccc acc gcc cac cac act ctg ggg ctg 302
Lys Arg Phe Arg Phe Ala Leu Pro Thr Ala His His Thr Leu Gly Leu
40 45 50

cct gtg ggc aaa cat atc tac ctc tcc acc cga att gat ggc agc ctg 350
Pro Val Gly Lys His Ile Tyr Leu Ser Thr Arg Ile Asp Gly Ser Leu
55 60 65

gtc atc agg cca tac act cct gtc acc agt gat gag gat caa ggc tat 398
Val Ile Arg Pro Tyr Thr Pro Val Thr Ser Asp Glu Asp Gln Gly Tyr
70 75 80

gtg gat ctt gtc atc aag gtc tac ctg aag ggt gtg cac ccc aaa ttt 446
Val Asp Leu Val Ile Lys Val Tyr Leu Lys Gly Val His Pro Lys Phe
85 90 95

cct gag gga ggg aag atg tct cag tac ctg gat agc ctg aag gtt ggg 494
Pro Glu Gly Gly Lys Met Ser Gln Tyr Leu Asp Ser Leu Lys Val Gly
100 105 110 115

gat gtg gtg gag ttt cgg ggg cca agc ggg ttg ctc act tac act gga 542
Asp Val Val Glu Phe Arg Gly Pro Ser Gly Leu Leu Thr Tyr Thr Gly
120 125 130

aaa ggg cat ttt aac att cag ccc aac aag aaa tct cca cca gaa ccc 590
Lys Gly His Phe Asn Ile Gln Pro Asn Lys Lys Ser Pro Pro Glu Pro
135 140 145

cga gtg gcg aag aaa ctg gga atg att gcc ggc ggg aca gga atc acc 638
Arg Val Ala Lys Lys Leu Gly Met Ile Ala Gly Gly Thr Gly Ile Thr
150 155 160

cca atg cta cag ctg atc cgg gcc atc ctg aaa gtc cct gaa gat cca 686
Pro Met Leu Gln Leu Ile Arg Ala Ile Leu Lys Val Pro Glu Asp Pro
165 170 175

acc cag tgc ttt ctg ctt ttt gcc aac cag aca gaa aag gat atc atc 734
Thr Gln Cys Phe Leu Leu Phe Ala Asn Gln Thr Glu Lys Asp Ile Ile
180 185 190 195

ttg cgg gag gac tta gag gaa ctg cag gcc cgc tat ccc aat cgc ttt 782
Leu Arg Glu Asp Leu Glu Glu Leu Gln Ala Arg Tyr Pro Asn Arg Phe
200 205 210

aag ctc tgg ttc act ctg gat cat ccc cca aaa gat tgg gcc tac agc 830
Lys Leu Trp Phe Thr Leu Asp His Pro Pro Lys Asp Trp Ala Tyr Ser
215 220 225

aag ggc ttt gtg act gcc gac atg atc cgg gaa cac ctg ccc gct cca 878
Lys Gly Phe Val Thr Ala Asp Met Ile Arg Glu His Leu Pro Ala Pro
230 235 240

ggg gat gat gtg ctg gta ctg ctt tgt ggg cca ccc cca atg gtg cag 926
Gly Asp Asp Val Leu Val Leu Leu Cys Gly Pro Pro Pro Met Val Gln
245 250 255

ctg gcc tgc cat ccc aac ttg gac aaa ctg ggc tac tca caa aag atg 974
Leu Ala Cys His Pro Asn Leu Asp Lys Leu Gly Tyr Ser Gln Lys Met
260 265 270 275

cga ttc acc tac tgagcatcct ccagcttccc tggtgctgtt cgctgcagtt 1026
Arg Phe Thr Tyr

gttccccatc agtactcaag cactataagc cttagattcc tttcctcaga gtttcaggtt 1086
ttttcagtta catctagagc tgaaatctgg atagtacctg caggaacaat attcctgtag 1146
ccatggaaga gggccaaggc tcagtcactc cttggatggc ctcctaaatc tccccgtggc 1206
aacaggtcca ggagaggccc atggagcagt ctcttccatg gagtaagaag gaagggagca 1266
tgtacgcttg gtccaagatt ggctagttcc ttgatagcat cttactctca ccttctttgt 1326
gtctgtgatg aaaggaacag tctgtgcaat gggttttact taaacttcac tgttcaacct 1386
atgagcaaat ctgtatgtgt gagtataagt tgagcatagc atacttccag aggtggtctt 1446
atggagatgg caagaaagga ggaaatgatt tcttcagatc tcaaaggagt ctgaaatatc 1506
atatttctgt gtgtgtctct ctcagcccct gcccaggcta gagggaaaca gctactgata 1566
atcgaaaact gctgtttgtg gcaggaaccc ctggctgtgc aaataaatgg ggctgaggcc 1626
cctgtgtgat attaaaaaaa aaaaaaaaaa a 1657

<210> 45
<211> 1733
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 157..1482

<220>
<221> sig_peptide
<222> 157..219
<223> Von Heijne matrix
score 11.6921972463885
seq LLLCLALSGAAET/KP

<400> 45

aaagaaaagt cggcagcaga gggaacaggg aagaaaccta aaggctgcag gctgccaggt 60 

gtgcttggag agcccccttc ttccgccggg cctcgcaagc agcgtaggac tgtggagaag 120 

35 ggcggtgggc aaggagggaa ctcgagagca gcctcc atg ggc aca cag gag ggc 174 

Met Gly Thr Gin Glu Gly 
-20 

tgg tgc ctg ctg etc tgc ctg get eta tct gga gca gca gaa acc aag 222 

Trp Cys Leu Leu Leu Cys Leu Ala Leu Ser Gly Ala Ala Glu Thr Lys 

40 -15 -10 -5 1 

ccc cac cca gca gag ggg cag ttg egg gca gtg gac gtg gtc eta gac 270 

Pro His Pro Ala Glu Gly Gin Leu Arg Ala Val Asp Val Val Leu Asp 

5 10 15 

tgc ttc ctg gcg aag gac ggt gcg cac cgt gga get etc gee age agt 318 

45 Cys Phe Leu Ala Lys Asp Gly Ala His Arg Gly Ala Leu Ala Ser Ser 
20 25 30 

gag gac agg gca agg gee tec ctt gtg ctg aag cag gtg cca gtg ctg 366 

Glu Asp Arg Ala Arg Ala Ser Leu Val Leu Lys Gin Val Pro Val Leu 
35 40 45 

50 gac gat ggc tec ctg gag gac ttc acc gat ttc caa ggg ggc aca ctg 414 

Asp Asp Gly Ser Leu Glu Asp Phe Thr Asp Phe Gin Gly Gly Thr Leu 

50 55 60 65 

gcc caa gat gac cca cct att atc ttt gag gcc tca gtg gac ctg gtc 462

Ala Gln Asp Asp Pro Pro Ile Ile Phe Glu Ala Ser Val Asp Leu Val

70 75 80

cag att ccc cag gcc gag gcc ttg ctc cat gct gac tgc agt ggg aag 510

Gln Ile Pro Gln Ala Glu Ala Leu Leu His Ala Asp Cys Ser Gly Lys

85 90 95

gag gtg acc tgt gag atc tcc cgc tac ttt ctc cag atg aca gag acc 558

Glu Val Thr Cys Glu Ile Ser Arg Tyr Phe Leu Gln Met Thr Glu Thr
100 105 110

act gtt aag aca gca gct tgg ttc atg gcc aac atg cag gtc tct gga 606

Thr Val Lys Thr Ala Ala Trp Phe Met Ala Asn Met Gln Val Ser Gly
115 120 125

ggg gga cst agc atc tcc ttg gtg atg aag act ccc agg gtc acc aag 654
Gly Gly Xaa Ser Ile Ser Leu Val Met Lys Thr Pro Arg Val Thr Lys
130 135 140 145

aat gag gcg ctc tgg cac ccg acg ctg aac ttg cca ctg agc ccc cag 702
Asn Glu Ala Leu Trp His Pro Thr Leu Asn Leu Pro Leu Ser Pro Gln
150 155 160

ggg act gtg cga act gca gtg gag ttc cag gtg atg aca cag acc caa 750
Gly Thr Val Arg Thr Ala Val Glu Phe Gln Val Met Thr Gln Thr Gln

165 170 175

tcc ctg age ttc ctg ctg ggg tcc tea gec tcc ttg gac tgt ggc ttc 798 
Ser Leu Ser Phe Leu Leu Gly Ser Ser Ala Ser Leu Asp Cys Gly Phe 

180 185 190 

tcc atg gca ccg ggc ttg gac etc atc agt gtg gag tgg cga ctg cag 846 

15 Ser Met Ala Pro Gly Leu Asp Leu lie Ser Val Glu Trp Arg Leu Gin 
195 200 205 

cac aag ggc agg ggt cag ttg gtg tac agc tgg acc gca ggg cag ggg 894

His Lys Gly Arg Gly Gin Leu Val Tyr Ser Trp Thr Ala Gly Gin Gly 
210 215 220 225 

20 cag get gtg egg aag ggc get acc ctg gag cct gca caa ctg ggc atg 942 
Gin Ala Val Arg Lys Gly Ala Thr Leu Glu Pro Ala Gin Leu Gly Met 

230 235 240 

gee agg gat gee tcc etc acc ctg ccc ggc etc act ata cag gac gag 990 
Ala Arg Asp Ala Ser Leu Thr Leu Pro Gly Leu Thr lie Gin Asp Glu 

25 245 250 255 

ggg acc tac att tgc cag atc acc acc tct ctg tac cga get cag cag 1038 
Gly Thr Tyr lie Cys Gin lie Thr Thr Ser Leu Tyr Arg Ala Gin Gin 

260 265 270 

atc atc cag etc aac atc caa get tcc cct aaa gta cga ctg age ttg 1086 

30 lie lie Gin Leu Asn lie Gin Ala Ser Pro Lys Val Arg Leu Ser Leu 
275 280 285 

gca aac gaa get ctg ctg ccc acc etc atc tgc gac att get ggc tat 1134 
Ala Asn Glu Ala Leu Leu Pro Thr Leu lie Cys Asp lie Ala Gly Tyr 
290 295 300 305 

35 tac cct ctg gat gtg gtg gtg acg tgg acc cga gag gag ctg ggt gga 1182 
Tyr Pro Leu Asp Val Val Val Thr Trp Thr Arg Glu Glu Leu Gly Gly 

310 315 320 

tcc cca gee caa gtc tct ggt gee tcc ttc tcc age etc agg caa age 1230 
Ser Pro Ala Gin Val Ser Gly Ala Ser Phe Ser Ser Leu Arg Gin Ser 

40 325 330 335 

gtg gca ggc acc tac age atc tcc tcc tct etc acc gca gaa cct ggc 1278 
Val Ala Gly Thr Tyr Ser lie Ser Ser Ser Leu Thr Ala Glu Pro Gly 

340 345 350 

tct gca ggt gee act tac acc tgc cag gtc aca cac atc tct ctg gag 1326 

45 Ser Ala Gly Ala Thr Tyr Thr Cys Gin Val Thr His lie Ser Leu Glu 
355 360 365 

gag ccc ctt ggg gee age acc cag gtt gtc cca cca gag egg aga aca 1374 
Glu Pro Leu Gly Ala Ser Thr Gin Val Val Pro Pro Glu Arg Arg Thr 
370 375 380 385 

50 gee ttg gga gtc atc ttt gee age agt etc ttc ctt ctt gca ctg atg 1422 
Ala Leu Gly Val lie Phe Ala Ser Ser Leu Phe Leu Leu Ala Leu Met 

390 395 400 

ttc ctg ggg ctt cag aga egg caa gca cct aca gga ctt ggg ctg ctt 1470 
Phe Leu Gly Leu Gin Arg Arg Gin Ala Pro Thr Gly Leu Gly Leu Leu 

55 405 410 415 

cag get gaa cgc taggagacca cttcctgtgc tgacacacag agctcccatc 1522 
Gin Ala Glu Arg 
420 

tccatgaaga ccgcacagcg cgtgtaagcc agcccagctg acctaaagcg acatgagact 1582 
60 actagaaaga aacgacaccc ttccccaagc ccccacagct actccaaccc aaacaacaac 1642 
caagecagtt taatggtagg aatttgtatt ttttgccttt gttcagaata catgacattg 1702 
gtaaatatgc cacaaaaaaa aaaaaaaaaa a 1733 
<210> 46
<211> 1871 
<212> DNA 

<213> Homo sapiens 

5 

<220> 

<221> CDS 

<222> 195. .1052 

10 <220> 

<221> sig_peptide 

<222> 195 . . 338 

<223> Von Heijne matrix 

score 3.50178852790004 
seq LGVFVVCHQLSSS/LN

<400> 46 

agtgactgcc gggagtcctg caggggcggg gcggcgccaa gcgcagggag cccggctgag 60 
tggcagccca gattgaagat ggatacgtga caatcccagg gaccgctgca ctgacttcat 120 
20 ttccttagac aagacacagt gtagggcccg gcccgtgttg gccccaggac tcctttggaa 180 

tatagctgtg gaca atg aat cct gcg age gat ggg ggc aca tea gag age 230 

Met Asn Pro Ala Ser Asp Gly Gly Thr Ser Glu Ser 
-45 -40 

att ttt gac ctg gac tat gca tec tgg ggg ate cgc tec acg ctg atg 278 

25 lie Phe Asp Leu Asp Tyr Ala Ser Trp Gly lie Arg Ser Thr Leu Met 

-35 -30 -25 

gtc get ggc ttt gtc ttc tac ttg ggc gtc ttt gtg gtc tgc cac cag 326 

Val Ala Gly Phe Val Phe Tyr Leu Gly Val Phe Val Val Cys His Gin 

-20 -15 -10 -5 

30 ctg tec tct tec ctg aat gee act tac cgt tct ttg gtg gec aga gag 374 

Leu Ser Ser Ser Leu Asn Ala Thr Tyr Arg Ser Leu Val Ala Arg Glu 

1 5 10 

aag gtc ttc tgg gac ctg gcg gec acg cgt gca gtc ttt ggt gtt cag 422 

Lys Val Phe Trp Asp Leu Ala Ala Thr Arg Ala Val Phe Gly Val Gin 

35 15 20 25 

age aca gee gca ggc ctg tgg get ctg ctg ggg gac cct gtg ctg cat 470 

Ser Thr Ala Ala Gly Leu Trp Ala Leu Leu Gly Asp Pro Val Leu His 

30 35 40 

gee gac aag gcg cgt ggc cag cag aac tgg tgc tgg ttt cac ate acg 518 

40 Ala Asp Lys Ala Arg Gly Gin Gin Asn Trp Cys Trp Phe His lie Thr 

45 50 55 60 

aca gca acg gga ttc ttt tgc ttt gaa aat gtt gca gtc cac ctg tec 566 

Thr Ala Thr Gly Phe Phe Cys Phe Glu Asn Val Ala Val His Leu Ser 

65 70 75 

45 aac ttg ate ttc egg aca ttt gac ttg ttt ctg gtt ate cac cat etc 614 

Asn Leu lie Phe Arg Thr Phe Asp Leu Phe Leu Val lie His His Leu 

80 85 90 

ttt gee ttt ctt ggg ttt ctt ggc tgc ttg gtc aat etc caa get ggc 662 

Phe Ala Phe Leu Gly Phe Leu Gly Cys Leu Val Asn Leu Gin Ala Gly 

50 95 100 105 

cac tat eta get atg ace acg ttg etc ctg gag atg age acg ccc ttt 710 

His Tyr Leu Ala Met Thr Thr Leu Leu Leu Glu Met Ser Thr Pro Phe 

110 115 120 

acc tgc gtt tec tgg atg etc tta aag gcg ggc tgg tec gag tct ctg 758 

55 Thr Cys Val Ser Trp Met Leu Leu Lys Ala Gly Trp Ser Glu Ser Leu 

125 130 135 140 

ttt tgg aag etc aac cag tgg ctg atg att cac atg ttt cac tgc cgc 806 

Phe Trp Lys Leu Asn Gin Trp Leu Met He His Met Phe His Cys Arg 

145 150 155 

60 atg gtt eta acc tac cac atg tgg tgg gtg tgt ttc tgg cac tgg gac 854 
Met Val Leu Thr Tyr His Met Trp Trp Val Cys Phe Trp His Trp Asp 

160 165 170 

ggc ctg gtc age age ctg tat ctg cct cat ttg aca ctg ttc ctt gtc 902 

Gly Leu Val Ser Ser Leu Tyr Leu Pro His Leu Thr Leu Phe Leu Val

175 180 185

gga ctg gct ctg ctt acg cta atc att aat cca tat tgg acc cat aag 950
Gly Leu Ala Leu Leu Thr Leu Ile Ile Asn Pro Tyr Trp Thr His Lys
190 195 200

aag act cag cag ctt ctc aat ccg gtg gac tgg aac ttc gca cag cca 998
Lys Thr Gln Gln Leu Leu Asn Pro Val Asp Trp Asn Phe Ala Gln Pro
205 210 215 220

gaa gcc aag agc agg cca gaa ggc aac ggg cag ctg ctg cgg aag aag 1046
Glu Ala Lys Ser Arg Pro Glu Gly Asn Gly Gln Leu Leu Arg Lys Lys

225 230 235

agg cca tagctgctcc agccggggct ccggggcggc agcagagctg gcacaccgat 1102
Arg Pro

tctgggaagc cccgcgaatg atggcttttg aattaatgag gcagtgaatg ttttgtgttt 1162
acttctaagg gaaatactaa ctttctttcg cattagtatt aattttgaag tagctacaaa 1222
gtatttttaa gaaattataa ttttatgact gtctggcagg ctctgtcagt ttagccgcgc 1282
cggaccgtgt caagcatcta ggagaggagt ccatggtgtc caggcatcgg ggcgtcacac 1342
ctgttgagga gtggggtggc tttgaatgct ggaaatggct tcatagtgaa gtgcctccca 1402
cagggcgggt gggtcagcgt tgactctttc cagctgcaca ctcatatgcc gtgtgtctta 1462
ttcagaagtc acattctttt cagttggaga gaattgggct aagatagaaa ataacatgat 1522
ttgttcctta ttaaagtttc ccagcgtatg aaattctaag ctgggtgggg tggctcacac 1582
ccgacgtaat cccagcacgt tgggaggccg aggcaggtgg atcacttgag gccaggagtt 1642
cgagaccagc ctggtcaaga tggtgaaacc ccatctctac taaaattaca aaaattagcc 1702
gggtgtcgtg gcacacacct gtaatcccag ctatttggga ggccaaggca ggagaattgc 1762
ctgaacccgg gaggcggagg ttgcagtgag ctgagatcgc accactgcac tccagcactc 1822
cagcctgggt gacggagcaa cactctctcg caaaaaaaaa aaaaaaaaa 1871



<210> 47 

<211> 1523 

30 <212> DNA 

<213> Homo sapiens 



35 



<220>
<221> CDS
<222> 217..1410

<220>

<221> sig_peptide

<222> 217..279
<223> Von Heijne matrix

score 5.8172934575094
seq ALLWAQEVGQVLA/GR

<400> 47

acttccccgg gagccggaag tcccgtctca cggttgccct ggcagcgcgc gaggctggtg 60
agtcggcagc cctgtggcag ccggcgggct ggtttccatg gttgcacgat taggaaccac 120
cagctgctgc atcccatggc caggggtggc gtccaggtgg cagagcagct aggaacgcaa 180
ggcctgaacc tggggccaga cacccctctc ccggcc atg gtc aac gac cct cca 234
Met Val Asn Asp Pro Pro
-20

gta cct gcc tta ctg tgg gcc cag gag gtg ggc caa gtc ttg gca ggc 282
Val Pro Ala Leu Leu Trp Ala Gln Glu Val Gly Gln Val Leu Ala Gly
-15 -10 -5 1

cgt gcc cgc agg ctg ctg ctg cag ttt ggg gtg ctc ttc tgc acc atc 330
Arg Ala Arg Arg Leu Leu Leu Gln Phe Gly Val Leu Phe Cys Thr Ile
5 10 15

ctc ctt ttg ctc tgg gtg tct gtc ttc ctc tat ggc tcc ttc tac tat 378
Leu Leu Leu Leu Trp Val Ser Val Phe Leu Tyr Gly Ser Phe Tyr Tyr
20 25 30

tcc tat atg ccg aca gtc agc cac ctc agc cct gtg cat ttc tac tac 426
Ser Tyr Met Pro Thr Val Ser His Leu Ser Pro Val His Phe Tyr Tyr
35 40 45

agg acc gac tgt gat tcc tcc acc acc tca ctc tgc tcc ttc cct gtt 474
Arg Thr Asp Cys Asp Ser Ser Thr Thr Ser Leu Cys Ser Phe Pro Val
50 55 60 65

gcc aat gtc teg ctg act aag ggt gga cgt gat egg gtg ctg atg tat 522 
Ala Asn Val Ser Leu Thr Lys Gly Gly Arg Asp Arg Val Leu Met Tyr 
5 70 75 80 

gga cag ccg tat cgt gtt ace tta gag ctt gag ctg cca gag tec cct 570 
Gly Gin Pro Tyr Arg Val Thr Leu Glu Leu Glu Leu Pro Glu Ser Pro 

85 90 95 

gtg aat caa gat ttg ggc atg ttc ttg gtc acc att tec tgc tac ace 618 
10 Val Asn Gin Asp Leu Gly Met Phe Leu Val Thr lie Ser Cys Tyr Thr 
100 105 110 

aga ggt ggc cga ate ate tec act tct teg cgt teg gtg atg ctg cat 666 
Arg Gly Gly Arg lie lie Ser Thr Ser Ser Arg Ser Val Met Leu His 
115 120 125 

15 tac cgc tea gac ctg etc cag atg ctg gac aca ctg gtc ttc tct age 714 
Tyr Arg Ser Asp Leu Leu Gin Met Leu Asp Thr Leu Val Phe Ser Ser 
130 135 140 145

etc ctg eta ttt ggc ttt gca gag cag aag cag ctg ctg gag gtg gaa 762 
Leu Leu Leu Phe Gly Phe Ala Glu Gin Lys Gin Leu Leu Glu Val Glu 
20 150 155 160 

etc tac gca gac tat aga gag aac teg tac gtg ccg acc act gga gcg 810 
Leu Tyr Ala Asp Tyr Arg Glu Asn Ser Tyr Val Pro Thr Thr Gly Ala 

165 170 175 

ate att gag ate cac age aag cgc ate cag ctg tat gga gcc tac etc 858 
25 lie lie Glu lie His Ser Lys Arg lie Gin Leu Tyr Gly Ala Tyr Leu 
180 185 190 

cgc ate cac gcg cac ttc act ggg etc aga tac ctg eta tac aac ttc 906 
Arg lie His Ala His Phe Thr Gly Leu Arg Tyr Leu Leu Tyr Asn Phe 
195 200 205 

30 ccg atg acc tgc gcc ttc ata ggt gtt gcc age aac ttc acc ttc etc 954 
Pro Met Thr Cys Ala Phe lie Gly Val Ala Ser Asn Phe Thr Phe Leu 
210 215 220 225 

age gtc ate gtg etc ttc age tac atg cag tgg gtg tgg ggg ggc ate 1002 
Ser Val lie Val Leu Phe Ser Tyr Met Gin Trp Val Trp Gly Gly lie 
35 230 235 240 

tgg ccc cga cac cgc ttc tct ttg cag gtt aac ate cga aaa aga gac 1050 
Trp Pro Arg His Arg Phe Ser Leu Gin Val Asn lie Arg Lys Arg Asp 

245 250 255 

aat tec egg aag gaa gtc caa cga agg ate tct get cat cag cca ggg 1098 
40 Asn Ser Arg Lys Glu Val Gin Arg Arg lie Ser Ala His Gin Pro Gly 
260 265 270 

cct gaa ggc cag gag gag tea act ccg caa tea gat gtt aca gag gat 1146 
Pro Glu Gly Gin Glu Glu Ser Thr Pro Gin Ser Asp Val Thr Glu Asp 
275 280 285 

45 ggt gag age cct gaa gat ccc tea ggg aca gag ggt cag ctg tec gag 1194 
Gly Glu Ser Pro Glu Asp Pro Ser Gly Thr Glu Gly Gin Leu Ser Glu 
290 295 300 305 

gag gag aaa cca gat cag cag ccc ctg age gga gaa gag gag eta gag 1242 
Glu Glu Lys Pro Asp Gin Gin Pro Leu Ser Gly Glu Glu Glu Leu Glu 
50 310 315 320 

cct gag gcc agt gat ggt tea ggc tec tgg gaa gat gca get ttg ctg 1290 
Pro Glu Ala Ser Asp Gly Ser Gly Ser Trp Glu Asp Ala Ala Leu Leu 

325 330 335 

acg gag gcc aac ctg cct get cct get cct get tct get tct gcc cct 1338 
55 Thr Glu Ala Asn Leu Pro Ala Pro Ala Pro Ala Ser Ala Ser Ala Pro 
340 345 350 

gtc cta gag act ctg ggc agc tct gaa cct gct ggg ggt gct ctc cga 1386

Val Leu Glu Thr Leu Gly Ser Ser Glu Pro Ala Gly Gly Ala Leu Arg 
355 360 365 

60 cag cgc ccc acc tgc tct agt tec tgaagaaaag gggcagactc ctcacattcc 1440 
Gin Arg Pro Thr Cys Ser Ser Ser 
370 375 

agcactttcc cacctgactc ctctcccctc gtttttcctt caataaacta ttttgtgtca 1500 
gcttcgaaaa aaaaaaaaaa aaa 1523



<210> 48 
<211> 832 
5 <212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
10 <222> 103 . .492 

<220> 

<221> sig_peptide

<222> 103. . 162 
15 <223> Von Heijne matrix 

score 8.21832452871462 
seq LFFCYLLLFTCSG/VE 



<400> 48

gtttactcgc tgctgtgccc atctatcagc aggctccggg ctgaagattg cttctcttct 60
ctcctccaag gtctagtgac ggagcccgcg cgcggcgcca cc atg cgg cag aag 114
Met Arg Gln Lys
-20

gcg gta tcg ctt ttc ttc tgc tac ctg ctg ctc ttc act tgc agt ggg 162
Ala Val Ser Leu Phe Phe Cys Tyr Leu Leu Leu Phe Thr Cys Ser Gly
-15 -10 -5

gtg gag gca ggt aag aaa aag tgc tcg gag agc tcg gac agc ggc tcc 210
Val Glu Ala Gly Lys Lys Lys Cys Ser Glu Ser Ser Asp Ser Gly Ser
1 5 10 15

ggg ttc tgg aag gcc ctg acc ttc atg gcc gtc gga gga gga ctc gca 258
Gly Phe Trp Lys Ala Leu Thr Phe Met Ala Val Gly Gly Gly Leu Ala
20 25 30

gtc gcc ggg ctg ccc gcg ctg ggc ttc acc ggc gcc ggc atc gcg gcc 306
Val Ala Gly Leu Pro Ala Leu Gly Phe Thr Gly Ala Gly Ile Ala Ala
35 40 45

aac tcg gtg gct gcc tcg ctg atg agc tgg tct gcg atc ctg aat ggg 354
Asn Ser Val Ala Ala Ser Leu Met Ser Trp Ser Ala Ile Leu Asn Gly
50 55 60

ggc ggc gtg ccc gcc ggg ggg cta gtg gcc acg ctg cag agc ctc ggg 402
Gly Gly Val Pro Ala Gly Gly Leu Val Ala Thr Leu Gln Ser Leu Gly
65 70 75 80

gct ggt ggc agc agc gtc gtc ata ggt aat att ggt gcc ctg atg ggc 450
Ala Gly Gly Ser Ser Val Val Ile Gly Asn Ile Gly Ala Leu Met Gly
85 90 95

tac gcc acc cac aag tat ctc gat agt gag gag gat gag gag 492
Tyr Ala Thr His Lys Tyr Leu Asp Ser Glu Glu Asp Glu Glu
100 105 110

tagccagcag ctcccagaac ctcttcttcc ttcttggcct aactcttcca gttaggatct 552
agaactttgc cttttttttt tttttttttt tttttttgag atgggttctc actatattgt 612
ccaggctaga gtgcagtggc tattcacaga tgcgaacata gtacactgca gcctccaact 672
cctagcctca agtgatcctc ctgtctcaac ctcccaagta ggattacaag catgcgccga 732
cgatgcccag aatccagaac tttgtctatc actctcccca acaacctaga tgtgaaaaca 792
gaataaactt cacccagaaa gcaaaaaaaa aaaaaaaaaa 832

<210> 49
<211> 831
<212> DNA
<213> Homo sapiens

<220>

<221> CDS
<222> 234..491

<220>

<221> sig_peptide 
<222> 234 . .293 
<223> Von Heijne matrix 
5 score 4.85037394589162 

seq AVAGLPALGFTGA/GI 

<400> 49 

gtttactcgc tgctgtgccc atctatcagc aggctccggg ctgaagattg cttctcttct 60 
10 ctcctccaag gtctagtgac ggagcccgcg cgcggcgcca ccatgcggca gaaggcggta 120 
tcgcttttct ctgctacctg ctgctcttca cttgcagtgg ggtggaggca ggtaagaaaa 180 
agtgctcgga gagctcggac agcggctccg ggttctggaa ggccctgacc ttc atg 236 

Met 
-20 

15 gcc gtc gga gga gga etc gca gtc gec ggg ctg ccc gcg ctg ggc ttc 284 
Ala Val Gly Gly Gly Leu Ala Val Ala Gly Leu Pro Ala Leu Gly Phe 

-15 -10 -5 

acc ggc gcc ggc atc gcg gcc aac tcg gtg gct gcc tcg ctg atg agc 332
Thr Gly Ala Gly Ile Ala Ala Asn Ser Val Ala Ala Ser Leu Met Ser
1 5 10

tgg tct gcg atc ctg aat ggg ggc ggc gtg ccc gcc ggg ggg cta gtg 380

Trp Ser Ala Ile Leu Asn Gly Gly Gly Val Pro Ala Gly Gly Leu Val

15 20 25

gcc acg ctg cag age etc ggg get ggt ggc age age gtc gtc ata ggt 428 
25 Ala Thr Leu Gin Ser Leu Gly Ala Gly Gly Ser Ser Val Val He Gly 
30 35 40 45 

aat att ggt gcc ctg atg ggc tac gcc acc cac aag tat etc gat agt 476 
Asn He Gly Ala Leu Met Gly Tyr Ala Thr His Lys Tyr Leu Asp Ser 
50 55 60 

30 gag gag gat gag gag tagecagcag ctcccagaac ctcttcttcc ttcttggcct 531 
Glu Glu Asp Glu Glu 
65 

aactcttcca gttaggatct agaactttgc cttttttttt tttttttttt tttttttgag 591 
atgggttctc actatattgt ccaggctaga gtgcagtggc tattcacaga tgegaacata 651 
35 gtacactgea gcctccaact cctagcctca agtgatcctc ctgtctcaac ctcccaagta 711 
ggattacaag catgcgccga cgatgcccag aatccagaac tttgtctatc actctcccca 771 
acaacctaga tgtgaaaaca gaataaactt cacccagaaa gcaaaaaaaa aaaaaaaaaa 831 

<210> 50 
40 <211> 917 
<212> DNA 
<213> Homo sapiens 

<220> 
45 <221> CDS 

<222> 180 . . 800 

<220> 

<221> sig_peptide 
50 <222> 180. .248 

<223> Von Heijne matrix 

score 14.6828672385356 

seq ILLLLWLIAPSRA/CT 

55 <400> 50 

acccttggct tetgeactga tggtgggtgg atgagtaatg catccaggaa gectggagge 60 
ctgtggtttc cgcacccgct gccacccccg cccctagcgt ggacatttat cctctagcgc 120 
tcaggccctg ccgccatcgc cgcagatcca gcgcccagag agacaccaga gaacccacc 179 
atg gcc ccc ttt gag ccc ctg gct tct ggc atc ctg ttg ttg ctg tgg 227
Met Ala Pro Phe Glu Pro Leu Ala Ser Gly Ile Leu Leu Leu Leu Trp
-20 -15 -10

ctg ata gcc ccc agc agg gcc tgc acc tgt gtc cca ccc cac cca cag 275
Leu Ile Ala Pro Ser Arg Ala Cys Thr Cys Val Pro Pro His Pro Gln

-5 1 5
acg gcc ttc tgc aat tec gac etc gtc ate agg gee aag ttc gtg ggg 323 
Thr Ala Phe Cys Asn Ser Asp Leu Val lie Arg Ala Lys Phe Val Gly 
10 15 20 25 

5 aca cca gaa gtc aac cag ace ace tta tac cag cgt tat gag ate aag 371 
Thr Pro Glu Val Asn Gin Thr Thr Leu Tyr Gin Arg Tyr Glu lie Lys 

30 35 40 

atg ace aag atg tat aaa ggg ttc caa gcc tta ggg gat gcc get gac 419 
Met Thr Lys Met Tyr Lys Gly Phe Gin Ala Leu Gly Asp Ala Ala Asp 
10 45 50 55 

ate egg ttc gtc tac acc ccc gcc atg gag agt gtc tgc gga tac ttc 467 
lie Arg Phe Val Tyr Thr Pro Ala Met Glu Ser Val Cys Gly Tyr Phe 

60 65 70 

cac agg tec cac aac cgc age gag gag ttt etc att get gga aaa ctg 515 
15 His Arg Ser His Asn Arg Ser Glu Glu Phe Leu lie Ala Gly Lys Leu 
75 80 85 

cag gat gga etc ttg cac ate act acc tgc agt ttt gtg get ccc tgg 563 
Gin Asp Gly Leu Leu His lie Thr Thr Cys Ser Phe Val Ala Pro Trp 
90 95 100 105 

20 aac age ctg age tta get cag cgc egg ggc ttc acc aag acc tac act 611 
Asn Ser Leu Ser Leu Ala Gin Arg Arg Gly Phe Thr Lys Thr Tyr Thr 

110 115 120 

gtt ggc tgt gag gaa tgc aca gtg ttt ccc tgt tta tec ttc ccc tgc 659 
Val Gly Cys Glu Glu Cys Thr Val Phe Pro Cys Leu Ser Phe Pro Cys 
25 125 130 135 

aaa ctg cag agt ggc act cat tgc ttg tgg acg gac cag etc etc caa 707 
Lys Leu Gin Ser Gly Thr His Cys Leu Trp Thr Asp Gin Leu Leu Gin 

140 145 150 

ggc tct gaa aag ggc ttc cag tec cgt cac ctt gcc tgc ctg cct egg 755 
30 Gly Ser Glu Lys Gly Phe Gin Ser Arg His Leu Ala Cys Leu Pro Arg 
155 160 165 

gag cca ggg ctg tgc acc tgg cag tec ctg egg tec cag ata gcc 800 
Glu Pro Gly Leu Cys Thr Trp Gin Ser Leu Arg Ser Gin lie Ala 
170 175 180 

35 tgaatcctgc ccggagtgga agetgaagee tgcacagtgt ccaccctgtt cccactccca 860 
tctttcttcc ggacaatgaa ataaagagtt accacccagc aaaaaaaaaa aaaaaaa 917 

<210> 51 

<211> 621 

40 <212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
45 <222> 140. .472 

<220> 

<221> sig_peptide 
<222> 140. .211 
50 <223> Von Heijne matrix 

score 8.44884907465122 

seq FVVFSLFLICAMA/GD

<400> 51 

55 attttttttt catatctgac atttctatgt ectatgaegg tttcacagct atcctacttt 60 
ggagaagatg ctggaaattc agagtttccg ccagagaata tatgectgaa ctaaaagagg 120 
aagtggtcta taggagaaa atg aaa tat gat tgt ccc ttc agt ggg aca tea 172 

Met Lys Tyr Asp Cys Pro Phe Ser Gly Thr Ser 
-20 -15 
60 ttt gtg gtc ttc tct etc ttt ttg ate tgt gca atg get gga gat gta 220 
Phe Val Val Phe Ser Leu Phe Leu lie Cys Ala Met Ala Gly Asp Val 

-10 -5 1 

gtc tac get gac ate aaa act gtt egg act tec ccg tta gaa etc gcg 268 

Val Tyr Ala Asp Ile Lys Thr Val Arg Thr Ser Pro Leu Glu Leu Ala

5 10 15

ttt cca ctt cag aga tct gtt tct ttc aac ttt tct act gtc cat aaa 316 
Phe Pro Leu Gin Arg Ser Val Ser Phe Asn Phe Ser Thr Val His Lys 
5 20 25 30 35 

tea tgt cct gec aaa gac tgg aag gtg cat aag gga aaa tgt tac tgg 364 
Ser Cys Pro Ala Lys Asp Trp Lys Val His Lys Gly Lys Cys Tyr Trp 

40 45 50 

att get gaa act aag aaa tct tgg aac aaa agt caa aat gac tgt gec 412 
10 lie Ala Glu Thr Lys Lys Ser Trp Asn Lys Ser Gin Asn Asp Cys Ala 
55 60 65 

ata aac aat tea tat etc atg gtg att caa gac att act get atg gtg 460 
lie Asn Asn Ser Tyr Leu Met Val lie Gin Asp lie Thr Ala Met Val 
70 75 80 

15 aga ttt aac att tagaggtgac agcatccccc acactggcag ttaatttttt 512 
Arg Phe Asn lie 
85 

gtctacaaac ttggcaaaag tctgtgaaaa gaagtttcaa cttcatgtgt tattaactat 572 
acaaatatta gttgaatgaa ttgttgaatt acaaaaaaaa aaaaaaaaa 621 

20 

<210> 52 
<211> 673 
<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
<222> 68..484

30 <220> 

<221> sig_peptide 

<222> 68 . . 112 

<223> Von Heijne matrix 

score 4.93618539864455 
seq AVVFVFSLLDCCA/LI

<400> 52 

ctatcagggg tgggtcgggg catccgagcg ggtttgacgg aaggagegge ggegaeggag 60 
gaggagg atg gag gcg gtg gtg ttc gtc ttc tct etc etc gat tgt tgc 109 
40 Met Glu Ala Val Val Phe Val Phe Ser Leu Leu Asp Cys Cys 

-15 -10 -5 

gcg ctc atc ttc ctc tcg gtc tac ttc ata att aca ttg tct gat tta 157
Ala Leu Ile Phe Leu Ser Val Tyr Phe Ile Ile Thr Leu Ser Asp Leu
1 5 10 15

gaa tgt gat tac att aat gct aga tca tgt tgc tca aaa tta aac aag 205

Glu Cys Asp Tyr lie Asn Ala Arg Ser Cys Cys Ser Lys Leu Asn Lys 

20 25 30 

tgg gta att cca gaa ttg att ggc cat ace att gtc act gta tta ctg 253 
Trp Val lie Pro Glu Leu lie Gly His Thr lie Val Thr Val Leu Leu 
50 35 40 45 

etc atg tea ttg cac tgg ttc ate ttc ctt etc aac tta cct gtt gec 301 
Leu Met Ser Leu His Trp Phe lie Phe Leu Leu Asn Leu Pro Val Ala 

50 55 60 

act tgg aat ata tat cga tac att atg gtg ccg agt ggt aac atg gga 349

55 Thr Trp Asn lie Tyr Arg Tyr lie Met Val Pro Ser Gly Asn Met Gly 
65 70 75 

gtg ttt gat cca aca gaa ata cac aat cga ggg cag ctg aag tea cac 397 
Val Phe Asp Pro Thr Glu lie His Asn Arg Gly Gin Leu Lys Ser His 
80 85 90 95 

60 atg aaa gaa gee atg ate aag ctt ggt ttc cac ttg etc tgc ttc ttc 445 
Met Lys Glu Ala Met lie Lys Leu Gly Phe His Leu Leu Cys Phe Phe 

100 105 110 

atg tat ctt tat agt atg ate tta get ttg ata aat gac tgaagctgga 494 
Met Tyr Leu Tyr Ser Met Ile Leu Ala Leu Ile Asn Asp

115 120

gaagccgtgg ttgaagtcag cctacactac agtgcacagt tgaggagcca gagacttctt 554 
aaatcatcct tagaaccgtg accatagcag tatatatttt cctcttggaa caaaaaacta 614 
5 tttttgctgt atttttacca tataaagtat ttaaaaaaca cgaaaaaaaa aaaaaaaaa 673 

<210> 53 

<211> 897 

<212> DNA 

10 <213> Homo sapiens 

<220> 
<221> CDS 
<222> 38. .517 

15 

<220> 

<221> sig_peptide 
<222> 38. .118 
<223> Von Heijne matrix 
20 score 7.20400999800742 

seq VLWLSGLSEPGAA/RQ 

<400> 53 

agattgggac agtcgccagg gatggctgag cgtgaag atg cag egg gtg tec ggg 55 
25 Met Gin Arg Val Ser Gly 

-25 

ctg etc tec tgg acg ctg age aga gtc ctg tgg etc tec ggc etc tct 103 
Leu Leu Ser Trp Thr Leu Ser Arg Val Leu Trp Leu Ser Gly Leu Ser 
-20 -15 -10 

30 gag ccg gga get gee egg cag ccc egg ate atg gaa gag aaa gcg eta 151 
Glu Pro Gly Ala Ala Arg Gin Pro Arg lie Met Glu Glu Lys Ala Leu 
-5 1 5 10

gag gtt tat gat ttg att aga act ate egg gac cca gaa aag ccc aat 199 
Glu Val Tyr Asp Leu lie Arg Thr lie Arg Asp Pro Glu Lys Pro Asn 

15 20 25

act tta gaa gaa ctg gaa gtg gtc teg gaa agt tgt gtg gaa gtt cag 247 
Thr Leu Glu Glu Leu Glu Val Val Ser Glu Ser Cys Val Glu Val Gin 

30 35 40 

gag ata aat gaa gaa gaa tat ctg gtt att ate agg ttc acg cca aca 295 

40 Glu lie Asn Glu Glu Glu Tyr Leu Val lie lie Arg Phe Thr Pro Thr 
45 50 55 

gta cct cat tgc tct ttg gcg act ctt att ggg ctg tgc tta aga gta 343 
Val Pro His Cys Ser Leu Ala Thr Leu lie Gly Leu Cys Leu Arg Val 
60 65 70 75 

45 aaa ctt cag cga tgt tta cca ttt aaa cat aag ttg gaa ate tac att 391 
Lys Leu Gin Arg Cys Leu Pro Phe Lys His Lys Leu Glu lie Tyr lie 

80 85 90 

tct gaa gga ace cac tea aca gaa gaa gac ate aat aag cag ata aat 439 
Ser Glu Gly Thr His Ser Thr Glu Glu Asp lie Asn Lys Gin lie Asn 

50 95 100 105 

gac aaa gag cga gtg gca get gca atg gaa aac ccc aac tta egg gaa 487 
Asp Lys Glu Arg Val Ala Ala Ala Met Glu Asn Pro Asn Leu Arg Glu 

110 115 120 

att gtg gaa cag tgt gtc ctt gaa cct gac tgatagctgt tttaagagcc 537

Ile Val Glu Gln Cys Val Leu Glu Pro Asp
125 130 
actggcctgt aattgtttga tatatttgta actctttgta taatgtcaga gactcatgtt 597 
taatacatag gtgatttgta cctcagagca ttttttaaag gattctttcc aagegagatt 657 
taattataag gtagtaccta atttgttcaa tgtataacat tctcaggatt tgtaacactt 717 

60 aaatgatcag acagaataat attttctagt tattatgtgt aagatgagtt gctatttttc 777 
tgatgetcat tctgatacaa ctatttttcg tgtcaaatat ctactgtgcc caaatgtact 837 
caatttaaat cattactctg taaaataaat aagcagatga ttcttataaa aaaaaaaaaa 897 
<210> 54

<211> 1101 

<212> DNA 

<213> Homo sapiens 

5 

<220> 

<221> CDS 

<222> 92 . . 634 

10 <220> 

<221> sig_peptide 

<222> 92 . . 139 

<223> Von Heijne matrix 

score 7.36306712986597 
15 seq FLLLTCLFITGTS/VS 

<400> 54 

cttaaaaaaa aaagtgcttg aaagagaagg ggacaaagga acaccagtat taagaggatt 60 
ttccagtgtt tctggcagtt ggtccagaag g atg cct cca ttc ctg ctt etc 112 

20 Met Pro Pro Phe Leu Leu Leu 

-15 -10 
acc tgc etc ttc ate aca ggc ace tec gtg tea ccc gtg gee eta gat 160 
Thr Cys Leu Phe lie Thr Gly Thr Ser Val Ser Pro Val Ala Leu Asp 
-5 1 5 

25 cct tgt tct get tac ate age ctg aat gag ccc tgg agg aac act gac 208 
Pro Cys Ser Ala Tyr lie Ser Leu Asn Glu Pro Trp Arg Asn Thr Asp 

10 15 20 

cac cag ttg gat gag tct caa ggt cct cct eta tgt gac aac cat gtg 256 
His Gin Leu Asp Glu Ser Gin Gly Pro Pro Leu Cys Asp Asn His Val 

30 25 30 35 

aat ggg gag tgg tac cac ttc acg ggc atg gcg gga gat gee atg cct 304 

Asn Gly Glu Trp Tyr His Phe Thr Gly Met Ala Gly Asp Ala Met Pro 

40 45 50 55 

acc ttc tgc ata cca gaa aac cac tgt gga acc cac gca cct gtc tgg 352 

35 Thr Phe Cys lie Pro Glu Asn His Cys Gly Thr His Ala Pro Val Trp 

60 65 70 

etc aat ggc age cac ccc eta gaa ggc gac ggc att gtg caa cgc cag 400 
Leu Asn Gly Ser His Pro Leu Glu Gly Asp Gly lie Val Gin Arg Gin 
75 80 85 

40 get tgt gee age ttc aat ggg aac tgc tgt etc tgg aac acc acg gtg 448 
Ala Cys Ala Ser Phe Asn Gly Asn Cys Cys Leu Trp Asn Thr Thr Val 

90 95 100 

gaa gtc aag get tgc cct gga ggc tac tat gtg tat cgt ctg acc aag 496 
Glu Val Lys Ala Cys Pro Gly Gly Tyr Tyr Val Tyr Arg Leu Thr Lys 

45 105 110 115 

ccc age gtc tgc ttc cac gtc tac tgt ggt cgt gag tac ctt ccc tgt 544 

Pro Ser Val Cys Phe His Val Tyr Cys Gly Arg Glu Tyr Leu Pro Cys 

120 125 130 135 

get ctt ttt etc cac caa caa ggc cac agg tgg agt cca aaa gtg ccc 592 

50 Ala Leu Phe Leu His Gin Gin Gly His Arg Trp Ser Pro Lys Val Pro 

140 145 150 

aat tat agg ata tgc agt tac agt ggc aac tat ate tea ate 634 
Asn Tyr Arg lie Cys Ser Tyr Ser Gly Asn Tyr lie Ser lie 
155 160 165 

55 tgaacaacat tgatgtgggg ctaaagatac tctgatttct gagatctctt cttagaactt 694 
ctgaaaaatt cctgaagaaa tagaagggga aaggagctat gactttgatc agttcttttt 754
aattttgtct gaattccatt caaacaaaac attagaaaat gaaacattgg gccaggcgca 814 
gtggctcatg cctgtaatcc cagcactttg ggaggctgag gcgggtggat cacaagatca 874 
ggagtttaag accagcctgg ccaatatggt gaaaccctgt ctctactaga aatacaaaaa 934 

60 ttagacaggc gtggtggcag gcaactgtaa ccccagctac cegggagget gaggcaggag 994 
aattgcttga accegggagg tggacgttgc ggtcaggega aaatcgtgcc attgcactcc 1054 
agcctgggtg acagagtgag actctgattc aaaaaaaaaa aaaaaaa 1101 
<210> 55
<211> 1047 
<212> DNA 

<213> Homo sapiens 

5 

<220> 
<221> CDS 
<222> 27 . .767 

10 <220> 

<221> sig_peptide 

<222> 27 . . 80 

<223> Von Heijne matrix 

score 8.96664802487992 
15 seq LFCLAVLAASSFS/KA 

<400> 55 

agcagaggcc ctacacccac cgaggc atg ggg etc cct ggg ctg ttc tgc ttg 53 

Met Gly Leu Pro Gly Leu Phe Cys Leu 
20 -15 -10 

gec gtg ctg get gec age age ttc tec aag gca egg gag gaa gaa att 101 
Ala Val Leu Ala Ala Ser Ser Phe Ser Lys Ala Arg Glu Glu Glu lie 

-5 1 5
ace cct gtg gtc tec att gec tac aaa gtc ctg gaa gtt ttc ccc aaa 149 
25 Thr Pro Val Val Ser lie Ala Tyr Lys Val Leu Glu Val Phe Pro Lys 
10 15 20 

ggc cgc tgg gtg etc ata ace tgc tgt gca ccc cag cca cca ccg ccc 197 
Gly Arg Trp Val Leu lie Thr Cys Cys Ala Pro Gin Pro Pro Pro Pro 
25 30 35 

30 ate acc tat tec etc tgt gga acc aag aac ate aag gtg gec aag aag 245 
lie Thr Tyr Ser Leu Cys Gly Thr Lys Asn lie Lys Val Ala Lys Lys 
40 45 50 55 

gtg gtg aag acc cac gag ccg gee tec ttc aac etc aac gtc aca etc 293 
Val Val Lys Thr His Glu Pro Ala Ser Phe Asn Leu Asn Val Thr Leu 
35 60 65 70 

aag tec agt cca gac ctg etc acc tac ttc tgc egg gcg tec tec acc 341 
Lys Ser Ser Pro Asp Leu Leu Thr Tyr Phe Cys Arg Ala Ser Ser Thr 

75 80 85 

tca ggt gcc cat gtg gac agt gcc agg cta cag atg cac tgg gag ctg 389

40 Ser Gly Ala His Val Asp Ser Ala Arg Leu Gin Met His Trp Glu Leu 
90 95 100 

tgg tec aag cca gtg tct gag ctg egg gec aac ttc act ctg cag gac 437 
Trp Ser Lys Pro Val Ser Glu Leu Arg Ala Asn Phe Thr Leu Gin Asp 
105 110 115 

aga ggg gca ggc ccc agg gtg gag atg atc tgc cag gcg tcc tcg ggc 485

Arg Gly Ala Gly Pro Arg Val Glu Met lie Cys Gin Ala Ser Ser Gly 
120 125 130 135 

agc cca cct atc acc aac agc ctg atc ggg aag gat ggg cag gtc cac 533
Ser Pro Pro Ile Thr Asn Ser Leu Ile Gly Lys Asp Gly Gln Val His
50 140 145 150 

ctg cag cag aga cca tgc cac agg cag cct gee aac ttc tec ttc ctg 581 
Leu Gin Gin Arg Pro Cys His Arg Gin Pro Ala Asn Phe Ser Phe Leu 

155 160 165 

ccg age cag aca teg gac tgg ttc tgg tgc cag get gca aac aac gec 629 
55 Pro Ser Gin Thr Ser Asp Trp Phe Trp Cys Gin Ala Ala Asn Asn Ala 
170 175 180 

aat gtc cag cac age gec etc aca gtg gtg ccc cca gga ggg ttg ccc 677 
Asn Val Gin His Ser Ala Leu Thr Val Val Pro Pro Gly Gly Leu Pro 
185 190 195 

60 agg gca ccc acc ate gtg ctg gtt ggc age ctt gee tec act gcg gee 725 
Arg Ala Pro Thr He Val Leu Val Gly Ser Leu Ala Ser Thr Ala Ala 
200 205 210 215 

atc acc tcc agg atg ctg ggc tgg acc acg tgg gcc agg tgg 767
Ile Thr Ser Arg Met Leu Gly Trp Thr Thr Trp Ala Arg Trp

220 225
tgaccagaag atggaggact ggcagggtcc cctggagagc cccatccttg ccttgccgct 827
ctacaggagc acccgccgtc tgagtgaaga ggagtttggg gggttcagga tagggaatgg 887
ggaggtcaga ggacgcaaag cagcagccat gtagaatgaa ccgtccagag agccaagcac 947
ggcagaggac tgcaggccat cagcgtgcac tgttcgtatt tggagttcat gcaaaatgag 1007
tgtgttttag ctgctcttgc cacaaaaaaa aaaaaaaaaa 1047



<210> 56
<211> 925
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 4..399

<220>
<221> sig_peptide
<222> 4..126
<223> Von Heijne matrix
score 4.34454795165846
seq RVVSWLFSIVVFG/SI

<400> 56

acg atg gaa ggg ggt gcg tac gga gcg ggc aaa gcc ggg ggc gcc ttc 48
Met Glu Gly Gly Ala Tyr Gly Ala Gly Lys Ala Gly Gly Ala Phe
-40 -35 -30

gac ccc tac acc ctg gtc cgg cag ccg cac acc atc ctg cgc gtc gtg 96
Asp Pro Tyr Thr Leu Val Arg Gln Pro His Thr Ile Leu Arg Val Val
-25 -20 -15

tct tgg ctg ttc tcc ata gtg gtg ttc ggc tcc atc gtg aac gag ggc 144
Ser Trp Leu Phe Ser Ile Val Val Phe Gly Ser Ile Val Asn Glu Gly
-10 -5 1 5

tac ctc aac agc gcc tcc gag ggg gag cag ttc tgc atc tac aac cgc 192
Tyr Leu Asn Ser Ala Ser Glu Gly Glu Gln Phe Cys Ile Tyr Asn Arg
10 15 20

aac ccc aac gcc tgc agc tat ggc gtg gcc gtg ggc gtg ctc gcc ttc 240
Asn Pro Asn Ala Cys Ser Tyr Gly Val Ala Val Gly Val Leu Ala Phe
25 30 35

ctc acc tgc ctg ctg tac ctg gcc ctg gac gtg tac ttc ccg cag atc 288
Leu Thr Cys Leu Leu Tyr Leu Ala Leu Asp Val Tyr Phe Pro Gln Ile
40 45 50

agc agc gtc aag gac cgc aag aaa gcc gtc ctg tcc gac atc ggt gtc 336
Ser Ser Val Lys Asp Arg Lys Lys Ala Val Leu Ser Asp Ile Gly Val
55 60 65 70

tcg ggt gag ccc cac cca gca ggt acc ccc tgc aca gag tct aca gag 384
Ser Gly Glu Pro His Pro Ala Gly Thr Pro Cys Thr Glu Ser Thr Glu
75 80 85

ggc tgt ccc ggg cca taggaggcgg ctgccaccct tcttcccatg tttcagatga 439
Gly Cys Pro Gly Pro
90

gggaaatgag ccttctgggc tttcctctgg ttcgtgggat tctgctacct ggccaaccag 499
tggcaggtct ccaagcccaa ggacaaccca ctgaacgaag ggacggacgc agcccgggcc 559
gccatcgcct tctccttttt ctccatcttc acctggagcc tgaccgcagc cctggccgtg 619
cggagattca aggacctaag cttccaggag gagtacagca cactgttccc tgcttcggca 679
cagccgtagg cctccccggc ttgcagaggc cggcagccct gtatcacccc tggcagtgag 739
gtggcaggag cagcctagtg ccagaaatgt ccaagatgcc agggcatgca gggcagtgga 799
aggctggctt gaggaaccaa ttcaggttct ccactgactc attcattcct tcaccgcctc 859
cttcattgat tcttcatgcg ttcattcatt cagtaaacat ttattgagta aaaaaaaaaa 919
aaaaaa 925



<210> 57
<211> 1240
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 127..879

<220>
<221> sig_peptide
<222> 127..198
<223> Von Heijne matrix
score 5.38660866264012
seq ALCSVCSMSVLRA/YP

<400> 57

agtctaggat cctcacacca gctacttgca agggagaagg aaaaggccag taaggcctgg 60
gccaggagag tcccgacagg agtgtcaggt ttcaatctca gcaccagcca ctcagagcag 120

ggcacg atg ttg ggg gcc cgc ctc agg ctc tgg gtc tgt gcc ttg tgc 168
Met Leu Gly Ala Arg Leu Arg Leu Trp Val Cys Ala Leu Cys
-20 -15

agc gtc tgc agc atg agc gtc ctc aga gcc tat ccc aat gcc tcc cca 216
Ser Val Cys Ser Met Ser Val Leu Arg Ala Tyr Pro Asn Ala Ser Pro
-10 -5 1 5

ctg ctc ggc tcc agc tgg ggt ggc ctg atc cac ctg tac aca gcc aca 264
Leu Leu Gly Ser Ser Trp Gly Gly Leu Ile His Leu Tyr Thr Ala Thr
10 15 20

gcc agg aac agc tac cac ctg cag atc cac aag aat ggc cat gtg gat 312
Ala Arg Asn Ser Tyr His Leu Gln Ile His Lys Asn Gly His Val Asp
25 30 35

ggc gca ccc cat cag acc atc tac agt gcc ctg atg atc aga tca gag 360
Gly Ala Pro His Gln Thr Ile Tyr Ser Ala Leu Met Ile Arg Ser Glu
40 45 50

gat gct ggc ttt gtg gtg att aca ggt gtg atg agc aga aga tac ctc 408
Asp Ala Gly Phe Val Val Ile Thr Gly Val Met Ser Arg Arg Tyr Leu
55 60 65 70

tgc atg gat ttc aga ggc aac att ttt gga tca cac tat ttc gac ccg 456
Cys Met Asp Phe Arg Gly Asn Ile Phe Gly Ser His Tyr Phe Asp Pro
75 80 85

gag aac tgc agg ttc caa cac cag acg ctg gaa aac ggg tac gac gtc 504
Glu Asn Cys Arg Phe Gln His Gln Thr Leu Glu Asn Gly Tyr Asp Val
90 95 100

tac cac tct cct cag tat cac ttc ctg gtc agt ctg ggc cgg gcg aag 552
Tyr His Ser Pro Gln Tyr His Phe Leu Val Ser Leu Gly Arg Ala Lys
105 110 115

aga gcc ttc ctg cca ggc atg aac cca ccc ccg tac tcc cag ttc ctg 600
Arg Ala Phe Leu Pro Gly Met Asn Pro Pro Pro Tyr Ser Gln Phe Leu
120 125 130

tcc cgg agg aac gag atc ccc cta att cac ttc aac acc ccc ata cca 648
Ser Arg Arg Asn Glu Ile Pro Leu Ile His Phe Asn Thr Pro Ile Pro
135 140 145 150

cgg cgg cac acc cgg agc gcc gag gac gac tcg gag cgg gac ccc ctg 696
Arg Arg His Thr Arg Ser Ala Glu Asp Asp Ser Glu Arg Asp Pro Leu
155 160 165

aac gtg ctg aag ccc cgg gcc cgg atg acc ccg gcc ccg gcc tcc tgt 744
Asn Val Leu Lys Pro Arg Ala Arg Met Thr Pro Ala Pro Ala Ser Cys
170 175 180

tca cag gag ctc ccg agc gcc gag gac aac agc ccg atg gcc agt gac 792
Ser Gln Glu Leu Pro Ser Ala Glu Asp Asn Ser Pro Met Ala Ser Asp
185 190 195

cca tta ggg gtg gtc agg ggc ggt cga gtg aac acg cac gct ggg gga 840
Pro Leu Gly Val Val Arg Gly Gly Arg Val Asn Thr His Ala Gly Gly
200 205 210

acg ggc ccg gaa ggc tgc cgc ccc ttc gcc aag ttc atc tagggtcgct 889
Thr Gly Pro Glu Gly Cys Arg Pro Phe Ala Lys Phe Ile
215 220 225

ggaagggcac cctctttaac ccatccctca gcaaacgcag ctcttcccaa ggaccaggtc 949
ccttgacgtt ccgaggatgg gaaaggtgac aggggcatgt atggaatttg ctgcttctct 1009
ggggtccctt ccacaggagg tcctgtgaga accaaccttt gaggcccaag tcatggggtt 1069
tcaccgcctt cctcactcca tatagaacac ctttcccaat aggaaacccc aacaggtaaa 1129
ctagaaattt ccccttcatg aaggtagaga gaaggggtct ctcccaacat atttctcttc 1189
cttgtgcctc tcctctttat cacttttaag catgaaaaaa aaaaaaaaaa a 1240

<210> 58
<211> 902
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 156..566

<220>
<221> sig_peptide
<222> 156..221
<223> Von Heijne matrix
score 5.67458379966095
seq LVSMAGRVCLCQG/SA

<400> 58

atttcccagc gtgcctcagg aagggcgcca ggactgcatt ttgctccgga gcgtccagag 60
tcctggccct gagcgggaat cgcagtggcc gaggctgagc ggcaggcgga tcgccccgac 120
cctcactcct ggcgtctgag tctctggcgt agccc atg ctg agt ggg cgg ctg 173
Met Leu Ser Gly Arg Leu
-20

gtc ctg ggt ctg gtc tcc atg gct ggc cgc gtt tgt ttg tgc cag ggc 221
Val Leu Gly Leu Val Ser Met Ala Gly Arg Val Cys Leu Cys Gln Gly
-15 -10 -5

agc gcg gga tcc ggg gcc atc ggt ccg gtg gag gcc gcc att cgc acg 269
Ser Ala Gly Ser Gly Ala Ile Gly Pro Val Glu Ala Ala Ile Arg Thr
1 5 10 15

aag ttg gag gag gcc ctg agc ccc gag gtg cta gag ctt cgc aac gag 317
Lys Leu Glu Glu Ala Leu Ser Pro Glu Val Leu Glu Leu Arg Asn Glu
20 25 30

agc ggt ggc cac gcg gtc ccg cca ggc agt gag act cac ttc cgc gtg 365
Ser Gly Gly His Ala Val Pro Pro Gly Ser Glu Thr His Phe Arg Val
35 40 45

gct gtg gtg agc tct cgt ttc gag gga ctg agc ccc cta caa cga cac 413
Ala Val Val Ser Ser Arg Phe Glu Gly Leu Ser Pro Leu Gln Arg His
50 55 60

cgg ctg gtc cac gca gcg ctg gcc gag gag ctg gga ggt ccg gtc cat 461
Arg Leu Val His Ala Ala Leu Ala Glu Glu Leu Gly Gly Pro Val His
65 70 75 80

gcg ctg gcc atc cag gca cgg acc ccc gcc cag tgg aga gag aac tct 509
Ala Leu Ala Ile Gln Ala Arg Thr Pro Ala Gln Trp Arg Glu Asn Ser
85 90 95

cag ctg gac act agc ccc cca tgc ctg ggt ggg aac aag aaa act cta 557
Gln Leu Asp Thr Ser Pro Pro Cys Leu Gly Gly Asn Lys Lys Thr Leu
100 105 110

gga acc ccc tgaaccccaa gagagggagg accaggatcc gaatgggctg 606
Gly Thr Pro
115

ggtgagcacg aattaccgag gccttccctt tgatacagtc caggatttgt aagggatgaa 666
gacccctggg ccccattctg ttggggtcca tacatactct ccgaagatag caacttgctt 726
caggtcaaag tgaacccgag aaaagagaag aatcactcac tactgctctt gccctggact 786
attcaggaag ggcagcccgg atgttccatg ttaaatcgtg acagaattgc accagacctg 846
atgagttgga aacaatccta tacattaaaa gaaattacac taaaaaaaaa aaaaaa 902

<210> 59
<211> 1969
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 35..1657

<220>
<221> sig_peptide
<222> 35..118
<223> Von Heijne matrix
score 3.75144398608723
seq SGLLLQVLFRLIT/FV

<400> 59

atttttcctg gtgtctgagc ctggcgcgga ggct atg ggc agc cag gag gtg ctg 55
Met Gly Ser Gln Glu Val Leu
-25

ggc cac gcg gcc cgg ctg gcc tcc tcc ggt ctc ctc ctg cag gtg ttg 103
Gly His Ala Ala Arg Leu Ala Ser Ser Gly Leu Leu Leu Gln Val Leu
-20 -15 -10

ttt cgg ttg atc acc ttt gtc ttg aat gca ttt att ctt cgc ttc ctg 151
Phe Arg Leu Ile Thr Phe Val Leu Asn Ala Phe Ile Leu Arg Phe Leu
-5 1 5 10

tca aag gaa atc gtt ggc gta gta aat gta aga cta acg ctg ctt tac 199
Ser Lys Glu Ile Val Gly Val Val Asn Val Arg Leu Thr Leu Leu Tyr
15 20 25

tca acc acc ctc ttc ctg gcc aga gag gcc ttc cgc aga gca tgt ctc 247
Ser Thr Thr Leu Phe Leu Ala Arg Glu Ala Phe Arg Arg Ala Cys Leu
30 35 40

agt ggg ggc acc cag cga gac tgg agc cag acc ctc aac ctg ctg tgg 295
Ser Gly Gly Thr Gln Arg Asp Trp Ser Gln Thr Leu Asn Leu Leu Trp
45 50 55

cta aca gtc ccc ctg ggt gtg ttt tgg tcc tta ttc ctg ggc tgg atc 343
Leu Thr Val Pro Leu Gly Val Phe Trp Ser Leu Phe Leu Gly Trp Ile
60 65 70 75

tgg ttg cag ctg ctt gaa gtg cct gat cct aat gtt gtc cct cac tat 391
Trp Leu Gln Leu Leu Glu Val Pro Asp Pro Asn Val Val Pro His Tyr
80 85 90

gca act gga gtg gtg ctg ttt ggt ctc tcg gca gtg gtg gag ctt cta 439
Ala Thr Gly Val Val Leu Phe Gly Leu Ser Ala Val Val Glu Leu Leu
95 100 105

gga gag ccc ttt tgg gtc ttg gca caa gca cat atg ttt gtg aag ctc 487
Gly Glu Pro Phe Trp Val Leu Ala Gln Ala His Met Phe Val Lys Leu
110 115 120

aag gtg att gca gag agc ctg tcg gta att ctt aag agc gtt ctg aca 535
Lys Val Ile Ala Glu Ser Leu Ser Val Ile Leu Lys Ser Val Leu Thr
125 130 135

gct ttt ctc gtg ctg tgg ttg cct cac tgg gga ttg tac att ttc tct 583
Ala Phe Leu Val Leu Trp Leu Pro His Trp Gly Leu Tyr Ile Phe Ser
140 145 150 155

ttg gcc cag ctt ttc tat acc aca gtt ctg gtg ctc tgc tat gtt att 631
Leu Ala Gln Leu Phe Tyr Thr Thr Val Leu Val Leu Cys Tyr Val Ile
160 165 170

tat ttc aca aag tta ctg ggt tcc cca gaa tca acc aag ctt caa act 679
Tyr Phe Thr Lys Leu Leu Gly Ser Pro Glu Ser Thr Lys Leu Gln Thr
175 180 185

ctt cct gtc tcc aga ata aca gat ctg tta ccc aat att aca aga aat 727
Leu Pro Val Ser Arg Ile Thr Asp Leu Leu Pro Asn Ile Thr Arg Asn
190 195 200

gga gcg ttt ata aac tgg aaa gag gct aaa ctg act tgg agt ttt ttc 775
Gly Ala Phe Ile Asn Trp Lys Glu Ala Lys Leu Thr Trp Ser Phe Phe
205 210 215

aaa cag tct ttc ttg aaa cag att ttg aca gaa ggc gag cga tat gtg 823
Lys Gln Ser Phe Leu Lys Gln Ile Leu Thr Glu Gly Glu Arg Tyr Val
220 225 230 235

atg aca ttt ttg aat gta ttg aac ttt ggt gat cag ggt gtg tat gat 871
Met Thr Phe Leu Asn Val Leu Asn Phe Gly Asp Gln Gly Val Tyr Asp
240 245 250

ata gtg aat aat ctt ggc tcc ctt gtg gcc aga tta att ttc cag cca 919
Ile Val Asn Asn Leu Gly Ser Leu Val Ala Arg Leu Ile Phe Gln Pro
255 260 265

ata gag gaa agt ttt tat ata ttt ttt gct aag gtg ctg gag agg gga 967
Ile Glu Glu Ser Phe Tyr Ile Phe Phe Ala Lys Val Leu Glu Arg Gly
270 275 280

aag gat gcc aca ctt cag aag cag gag gac gtt gct gtg gct gct gca 1015
Lys Asp Ala Thr Leu Gln Lys Gln Glu Asp Val Ala Val Ala Ala Ala
285 290 295

gtc ttg gag tcc ctg ctc aag ctg gcc ctg ctg gcc ggc ctg acc atc 1063
Val Leu Glu Ser Leu Leu Lys Leu Ala Leu Leu Ala Gly Leu Thr Ile
300 305 310 315

act gtt ttt ggc ttt gcc tat tct cag ctg gct ctg gat atc tac gga 1111
Thr Val Phe Gly Phe Ala Tyr Ser Gln Leu Ala Leu Asp Ile Tyr Gly
320 325 330

ggg acc atg ctt agc tca gga tcc ggt cct gtt ttg ctg cgt tcc tac 1159
Gly Thr Met Leu Ser Ser Gly Ser Gly Pro Val Leu Leu Arg Ser Tyr
335 340 345

tgt ctc tat gtt ctc ctg ctt gcc atc aat gga gtg aca gag tgt tta 1207
Cys Leu Tyr Val Leu Leu Leu Ala Ile Asn Gly Val Thr Glu Cys Leu
350 355 360

aca ttt gct gcc atg agc aaa gag gag gtc gac agg tac aat ttt gtg 1255
Thr Phe Ala Ala Met Ser Lys Glu Glu Val Asp Arg Tyr Asn Phe Val
365 370 375

atg ctg gcc ctg tcc tcc tca ttc ctg gtg tta tcc tat ctc ttg acc 1303
Met Leu Ala Leu Ser Ser Ser Phe Leu Val Leu Ser Tyr Leu Leu Thr
380 385 390 395

cgt tgg tgt ggc agc gtg ggc ttc atc ttg gcc aac tgc ttt aac atg 1351
Arg Trp Cys Gly Ser Val Gly Phe Ile Leu Ala Asn Cys Phe Asn Met
400 405 410

ggc att cgg atc acg cag agc ctt tgc ttc atc cac cgc tac tac cga 1399
Gly Ile Arg Ile Thr Gln Ser Leu Cys Phe Ile His Arg Tyr Tyr Arg
415 420 425

agg agc ccc cac agg ccc ctg gct ggc ctg cac cta tcg cca gtc ctg 1447
Arg Ser Pro His Arg Pro Leu Ala Gly Leu His Leu Ser Pro Val Leu
430 435 440

ctc ggg aca ttt gcc ctc agt ggt ggg gtt act gct gtt tcg gag gta 1495
Leu Gly Thr Phe Ala Leu Ser Gly Gly Val Thr Ala Val Ser Glu Val
445 450 455

ttc ctc tgc tgt gat cag ggc tgg cca gcc aga ctg gca cac att gct 1543
Phe Leu Cys Cys Asp Gln Gly Trp Pro Ala Arg Leu Ala His Ile Ala
460 465 470 475

gtg ggg gcc ttc tgt ctg gga gca act ctc ggg aca gca ttc ctc aca 1591
Val Gly Ala Phe Cys Leu Gly Ala Thr Leu Gly Thr Ala Phe Leu Thr
480 485 490

gag acc aag ctg atc cat ttc ctc agg act cag tta ggt gtg ccc aga 1639
Glu Thr Lys Leu Ile His Phe Leu Arg Thr Gln Leu Gly Val Pro Arg
495 500 505

cgc act gac aaa atg aca tgacttcagg gaagcctgga cacccgaggc 1687
Arg Thr Asp Lys Met Thr
510

acctggacca gctatgggta gttctgtggg tggaacacat tctgtgtaag agccccactg 1747
agggctctgc agcggagtga cagcaacccc agagatgagg caccagagag tgccactgca 1807
tgagacacct gtgaccattc gaagtctgaa atgcgggggg ggagtttcat ttttaagtga 1867
agaccaaaag ccctttaaaa ataatagttt tttatcattt tatagtaatc agcattttct 1927
cttttactaa tatactcatt ccttttgaaa aaaaaaaaaa aa 1969

<210> 60
<211> 1132
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 77..937

<220>
<221> sig_peptide
<222> 77..127
<223> Von Heijne matrix
score 3.74817238048175
seq RIVSAALLAFVQT/HL

<400> 60

gttggtgggg ctgggggatg agagctgcac cgcgcgggac aagtcgccgg cggcccgacg 60
gagcagaaga gagagc atg gag ctg gag agg atc gtc agt gca gcc ctc ctt 112
Met Glu Leu Glu Arg Ile Val Ser Ala Ala Leu Leu
-15 -10

gcc ttt gtc cag aca cac ctc ccg gag gcc gac ctc agt ggc ttg gat 160
Ala Phe Val Gln Thr His Leu Pro Glu Ala Asp Leu Ser Gly Leu Asp
-5 1 5 10

gag gtc atc ttc tcc tat gtg ctt ggg gtc ctg gag gac ctg ggc ccc 208
Glu Val Ile Phe Ser Tyr Val Leu Gly Val Leu Glu Asp Leu Gly Pro
15 20 25

tcg ggc cca tca gag gag aac ttc gat atg gag gct ttc act gag atg 256
Ser Gly Pro Ser Glu Glu Asn Phe Asp Met Glu Ala Phe Thr Glu Met
30 35 40

atg gag gcc tat gtg cct ggc ttc gcc cac atc ccc agg ggc aca ata 304
Met Glu Ala Tyr Val Pro Gly Phe Ala His Ile Pro Arg Gly Thr Ile
45 50 55

ggg gac atg atg cag aag ctc tca ggg cag ctg agc gat gcc agg aac 352
Gly Asp Met Met Gln Lys Leu Ser Gly Gln Leu Ser Asp Ala Arg Asn
60 65 70 75

aaa gag aac ctg caa ccg cag agc tct ggt gtc caa ggt cag gtg ccc 400
Lys Glu Asn Leu Gln Pro Gln Ser Ser Gly Val Gln Gly Gln Val Pro
80 85 90

atc tcc cca gag ccc ctg cag cgg ccc gaa atg ctc aaa gaa gag act 448
Ile Ser Pro Glu Pro Leu Gln Arg Pro Glu Met Leu Lys Glu Glu Thr
95 100 105

agg tct tcg gct gct gct gct gca gac acc caa gat gag gca act ggc 496
Arg Ser Ser Ala Ala Ala Ala Ala Asp Thr Gln Asp Glu Ala Thr Gly
110 115 120

gct gag gag gag ctt ctg cca ggg gtg gat gta ctc ctg gag gtg ttc 544
Ala Glu Glu Glu Leu Leu Pro Gly Val Asp Val Leu Leu Glu Val Phe
125 130 135

cct acc tgt tcg gtg gag cag gcc cag tgg gtg ctg gcc aaa gct cgg 592
Pro Thr Cys Ser Val Glu Gln Ala Gln Trp Val Leu Ala Lys Ala Arg
140 145 150 155

ggg gac ttg gaa gaa gct gtg cag atg ctg gta gag gga aag gaa gag 640
Gly Asp Leu Glu Glu Ala Val Gln Met Leu Val Glu Gly Lys Glu Glu
160 165 170

ggg cct gca gcc tgg gag ggc ccc aac cag gac ctg ccc aga cgc ctc 688
Gly Pro Ala Ala Trp Glu Gly Pro Asn Gln Asp Leu Pro Arg Arg Leu
175 180 185

aga ggc ccc caa aag gat gag ctg aag tcc ttc atc ctg cag aag tac 736
Arg Gly Pro Gln Lys Asp Glu Leu Lys Ser Phe Ile Leu Gln Lys Tyr
190 195 200

atg atg gtg gat agc gca gag gat cag aag att cac cgg ccc atg gct 784
Met Met Val Asp Ser Ala Glu Asp Gln Lys Ile His Arg Pro Met Ala
205 210 215

ccc aag gag gcc ccc aag aag ctg atc cga tac atc gac aac cag gta 832
Pro Lys Glu Ala Pro Lys Lys Leu Ile Arg Tyr Ile Asp Asn Gln Val
220 225 230 235

gtg agc acc aaa ggg gag cga ttc aaa gat gtg cgg aac cct gag gcc 880
Val Ser Thr Lys Gly Glu Arg Phe Lys Asp Val Arg Asn Pro Glu Ala
240 245 250

gag gag atg aag gcc aca tac atc aac ctc aag cca gcc aga aag tac 928
Glu Glu Met Lys Ala Thr Tyr Ile Asn Leu Lys Pro Ala Arg Lys Tyr
255 260 265

cgc ttc cat tgaggcactc gccggactct gcccgagcct tctaggctca 977
Arg Phe His
270

gatcccagag ggatgcagga gccctatacc cctacacagg ggccccctaa ctcctgtccc 1037
ccttctctac tcctttgctc catagtgtta acctactctc ggagctgcct ccatgggcac 1097
agtaaaggtg gcccaaggaa aaaaaaaaaa aaaaa 1132

<210> 61
<211> 631
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 9..503

<220>
<221> sig_peptide
<222> 9..113
<223> Von Heijne matrix
score 10.2506494380376
seq LLPLVLLPPLAAA/AA

<400> 61

tgccaggg atg atg cgc tgc tgc cgc cgc cgc tgc tgc tgc cgg caa cca 50
Met Met Arg Cys Cys Arg Arg Arg Cys Cys Cys Arg Gln Pro
-35 -30 -25

ccc cat gcc ctg agg ccg ttg ctg ttg ctg ccc ctc gtc ctt tta cct 98
Pro His Ala Leu Arg Pro Leu Leu Leu Leu Pro Leu Val Leu Leu Pro
-20 -15 -10

ccc ctg gca gca gct gca gcg ggc cca aac cga tgt gac acc ata tac 146
Pro Leu Ala Ala Ala Ala Ala Gly Pro Asn Arg Cys Asp Thr Ile Tyr
-5 1 5 10

cag ggc ttc gcc gag tgt ctc atc cgc ttg ggg gac agc atg ggc cgc 194
Gln Gly Phe Ala Glu Cys Leu Ile Arg Leu Gly Asp Ser Met Gly Arg
15 20 25

gga ggc gag ctg gag acc atc tgc agg tct tgg aat tac ttc cat gcc 242
Gly Gly Glu Leu Glu Thr Ile Cys Arg Ser Trp Asn Tyr Phe His Ala
30 35 40

tgt gcc tct cag gtc ctg tca ggc tgt ccg gag gag gca gct gca gtg 290
Cys Ala Ser Gln Val Leu Ser Gly Cys Pro Glu Glu Ala Ala Ala Val
45 50 55

tgg gaa tca cta cag caa gaa gct cgc cag gcc ccc cgt ccg aat aac 338
Trp Glu Ser Leu Gln Gln Glu Ala Arg Gln Ala Pro Arg Pro Asn Asn
60 65 70 75

ttg cac act ctg tgc ggt gcc ccg gtg cat gtt cgg gag cgc ggc aca 386
Leu His Thr Leu Cys Gly Ala Pro Val His Val Arg Glu Arg Gly Thr
80 85 90

ggc tcc gaa acc aac cag gag acg ctg cgg gct aca gcg cct gca ctc 434
Gly Ser Glu Thr Asn Gln Glu Thr Leu Arg Ala Thr Ala Pro Ala Leu
95 100 105

ccc atg gcc cct gcg ccc cca ctg ctg gcg gct gct ctg gct ctg gcc 482
Pro Met Ala Pro Ala Pro Pro Leu Leu Ala Ala Ala Leu Ala Leu Ala
110 115 120

tac ctc ctg agg cct ctg gcc tagcttgttg ggttgggtag cagcgcccgt 533
Tyr Leu Leu Arg Pro Leu Ala
125 130

acctccagcc ctgctctggc ggtggttgtc caggctctgc agagcgcagc agggcttttc 593
attaaaggta tttatatttg caaaaaaaaa aaaaaaaa 631

<210> 62
<211> 722
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 21..464

<220>
<221> sig_peptide
<222> 21..95
<223> Von Heijne matrix
score 5.38058532480537
seq AVTSLLSPTPATA/LA

<400> 62

ggaagtgagt gatcgaaagc atg gcg tcg gtg gtg ttg gcg ctg agg acc cgg 53
Met Ala Ser Val Val Leu Ala Leu Arg Thr Arg
-25 -20 -15

aca gcc gtt aca tcc ttg cta agc ccc act ccg gct aca gct ctt gct 101
Thr Ala Val Thr Ser Leu Leu Ser Pro Thr Pro Ala Thr Ala Leu Ala
-10 -5 1

gtc aga tac gca tcc aag aag tcg ggt ggt agc tcc aaa aac ctc ggt 149
Val Arg Tyr Ala Ser Lys Lys Ser Gly Gly Ser Ser Lys Asn Leu Gly
5 10 15

gga aag tca tca ggc aga cgc caa ggc att aag aaa atg gaa ggt cac 197
Gly Lys Ser Ser Gly Arg Arg Gln Gly Ile Lys Lys Met Glu Gly His
20 25 30

tat gtt cat gct ggg aac atc att gca aca cag cgc cat ttc cgc tgg 245
Tyr Val His Ala Gly Asn Ile Ile Ala Thr Gln Arg His Phe Arg Trp
35 40 45 50

cac cca ggt gcc cat gtg ggt gtt ggg aag aat aaa tgt ctg tat gcc 293
His Pro Gly Ala His Val Gly Val Gly Lys Asn Lys Cys Leu Tyr Ala
55 60 65

ctg gaa gag ggg ata gtc cgc tac act aag gag gtc tac gtg cct cat 341
Leu Glu Glu Gly Ile Val Arg Tyr Thr Lys Glu Val Tyr Val Pro His
70 75 80

ccc aga aac acg gag gct gtg gat ctg atc acc agg ctg ccc aag ggt 389
Pro Arg Asn Thr Glu Ala Val Asp Leu Ile Thr Arg Leu Pro Lys Gly
85 90 95

gct gtg ctc tac aag act ttt gtc cac gtg gtt cct gcc aag cct gag 437
Ala Val Leu Tyr Lys Thr Phe Val His Val Val Pro Ala Lys Pro Glu
100 105 110

ggc acc ttc aaa ctg gta gct atg ctt tgatgtcctg ttgaggccat 484
Gly Thr Phe Lys Leu Val Ala Met Leu
115 120

cggacagaga ctggagccca ggtgacagga gatggtgata ccagaagtca agggttgggg 544
tggcgacacg gcctcccgag gaagaggtct gcttgatggt gactctgcag gagactctga 604
agtgactgct gggaaaccct ttgggagacc tgacctgggg ccaaaaataa agtgagccag 664
cgtcatgaac gcatgctatt tagggacaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa 722

<210> 63
<211> 1442
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 178..1050

<220>
<221> sig_peptide
<222> 178..279
<223> Von Heijne matrix
score 10.0571391689271
seq FLCLLSALLLTEG/KK

<400> 63

agtgcattgc tggagcgagg agaagctcac gaatcagctg caggtctctg ttttgaaaaa 60
gcagagatac agaggcagag gaaaagggca ctcctatgtg acctgttctt agagcaagac 120
aatcaccatc tgaattccag aagccctgtt catggttggg gatattttct cgactgc 177

atg gaa tca gaa aga agc aaa agg atg gga aat gcc tgc att ccc ctg 225
Met Glu Ser Glu Arg Ser Lys Arg Met Gly Asn Ala Cys Ile Pro Leu
-30 -25 -20

aaa aga att gct tat ttc cta tgt ctc tta tct gcg ctt ttg ctg act 273
Lys Arg Ile Ala Tyr Phe Leu Cys Leu Leu Ser Ala Leu Leu Leu Thr
-15 -10 -5

gag ggg aag aaa cca gcg aag cca aaa tgc cct gcc gtg tgt act tgt 321
Glu Gly Lys Lys Pro Ala Lys Pro Lys Cys Pro Ala Val Cys Thr Cys
1 5 10

acc aaa gat aat gct tta tgt gag aat gcc aga tcc att cca cgc acc 369
Thr Lys Asp Asn Ala Leu Cys Glu Asn Ala Arg Ser Ile Pro Arg Thr
15 20 25 30

gtt cct cct gat gtt atc tca tta tcc ttt gtg aga tct ggt ttt act 417
Val Pro Pro Asp Val Ile Ser Leu Ser Phe Val Arg Ser Gly Phe Thr
35 40 45

gaa atc tca gaa ggg agt ttt tta ttc acg cca tcg ctg cag ctc ttg 465
Glu Ile Ser Glu Gly Ser Phe Leu Phe Thr Pro Ser Leu Gln Leu Leu
50 55 60

tta ttc aca tcg aac tcc ttt gat gtg atc agt gat gat gct ttt att 513
Leu Phe Thr Ser Asn Ser Phe Asp Val Ile Ser Asp Asp Ala Phe Ile
65 70 75

ggt ctt cca cat cta gag tat tta ttc ata gaa aac aac aac atc aag 561
Gly Leu Pro His Leu Glu Tyr Leu Phe Ile Glu Asn Asn Asn Ile Lys
80 85 90

tca att tca aga cat act ttc cgg gga cta aag tca tta att cac ttg 609
Ser Ile Ser Arg His Thr Phe Arg Gly Leu Lys Ser Leu Ile His Leu
95 100 105 110

agc ctt gca aac aac aat ctc cag aca ctc cca aaa gat att ttc aaa 657
Ser Leu Ala Asn Asn Asn Leu Gln Thr Leu Pro Lys Asp Ile Phe Lys
115 120 125

ggc ctg gat tct tta aca aat gtg gac ctg agg ggt aat tca ttt aat 705
Gly Leu Asp Ser Leu Thr Asn Val Asp Leu Arg Gly Asn Ser Phe Asn
130 135 140

tgt gac tgt aaa ctg aaa tgg cta gtg gaa tgg ctt ggc cac acc aat 753
Cys Asp Cys Lys Leu Lys Trp Leu Val Glu Trp Leu Gly His Thr Asn
145 150 155

gca act gtt gaa gac atc tac tgc gaa ggc ccc cca gaa tac aag aag 801
Ala Thr Val Glu Asp Ile Tyr Cys Glu Gly Pro Pro Glu Tyr Lys Lys
160 165 170

cgc aaa atc aat agt ctc tcc tcg aag gat ttc gat tgc atc att aca 849
Arg Lys Ile Asn Ser Leu Ser Ser Lys Asp Phe Asp Cys Ile Ile Thr
175 180 185 190

gaa ttt gca aag tct caa gac ctg cct tat caa tca ttg tcc ata gac 897
Glu Phe Ala Lys Ser Gln Asp Leu Pro Tyr Gln Ser Leu Ser Ile Asp
195 200 205

act ttt tct tat ttg aat gat gag tat gta gtc atc gct cag cct ttt 945
Thr Phe Ser Tyr Leu Asn Asp Glu Tyr Val Val Ile Ala Gln Pro Phe
210 215 220

act gga aaa tgc att ttc ctt gaa tgg gac cat gtg gaa aag acc ttc 993
Thr Gly Lys Cys Ile Phe Leu Glu Trp Asp His Val Glu Lys Thr Phe
225 230 235

cgg aat tat gac aac att aca gtt tta agg gaa ata cac aga ttt aca 1041
Arg Asn Tyr Asp Asn Ile Thr Val Leu Arg Glu Ile His Arg Phe Thr
240 245 250

aac atg tca tagttgactt aagcgcatga gacaccaaat tctgtggctg 1090
Asn Met Ser
255

ccatcagaaa ttttctacag tacatgaccc ggatgaactc aatgcatgat gactcttctt 1150
atcacacttg caaatgaatg cctttcaaac attgagactg ctagaaccaa gcactaccag 1210
tatctccatc cttaactgtc cagtccagtg atgtgggaag ttacctttta taagacaaaa 1270
tttaattgtg taactgttct ttgcagtgaa gatgtgtaaa taagcgttta atggtatctg 1330
ttactccaaa aagaaatatt aatatgtact tttccattta tttattcatg tgtacagaaa 1390
caactgccaa ataaaatgtt tacattttct ttcagaaaaa aaaaaaaaaa aa 1442

<210> 64
<211> 795
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 32..274

<220>
<221> sig_peptide
<222> 32..178
<223> Von Heijne matrix
score 4.30837886795471
seq LMVELLKVFVVEA/AV

<400> 64

gttccggtgg gcgcgcgttg aggctgcggt c atg gag gga gca gga gct gga 52
Met Glu Gly Ala Gly Ala Gly
-45

tcc ggc ttc cgg aag gag ctg gtg agc agg ctg ctg cac ctg cac ttc 100
Ser Gly Phe Arg Lys Glu Leu Val Ser Arg Leu Leu His Leu His Phe
-40 -35 -30

aag gat gac aag acc aaa gtg agc ggg gac gcg ctg cag ctc atg gtg 148
Lys Asp Asp Lys Thr Lys Val Ser Gly Asp Ala Leu Gln Leu Met Val
-25 -20 -15

gag ttg ctg aag gtc ttc gtt gtg gaa gca gca gtc cgc ggc gtg cgg 196
Glu Leu Leu Lys Val Phe Val Val Glu Ala Ala Val Arg Gly Val Arg
-10 -5 1 5

cag gcc cag gca gaa gac gcg ctc cgt gtg gac gtg gac cag ctg gag 244
Gln Ala Gln Ala Glu Asp Ala Leu Arg Val Asp Val Asp Gln Leu Glu
10 15 20

aag gtg ctt ccg cag ctg ctc ctg gac ttc tagggatctc agccgtggct 294
Lys Val Leu Pro Gln Leu Leu Leu Asp Phe
25 30

gaggccaccc ccagaggagc ccctggtcca cagaagcagg ccttgtgttt ccagcggcct 354
ctgataagag gcagggaagg acctgaagga tttggagttg attcaaacaa gatctctggg 414
agtctccagc ctgtgcagaa ggggcaggac tgcagtgcac tgcgggcctt ggagtgtcca 474
gtggggacac tggtgtggga aggggcagca cctggggagt ccctgcctct cctccctggg 534
acaatagtgt gcatgccacc cggggtccta caggcaggtg ctgggaaagg cctggccagc 594
aggtagcctg tgtgtttgac aaacagcagc tggcagcgct gcctcctgcc cacattcctg 654
ccacccgaca tcaaagctgg cgtgtgacct ttccagccat gcgatattcc ccttggaaga 714
tgcttcccca ggctataaat ttgttctcac aaagcaacat caataaatca aaactgtctc 774


<210> 65
<211> 1236
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 222..920

<220>
<221> sig_peptide
<222> 222..311
<223> Von Heijne matrix
score 4.35083245061594
seq VAHALSLPAESYG/ND

<400> 65

accgaaaatt actgacgagt caatcacctc agatctctca agcagtccag cctacgcaac 60
agtactccac ctctgcgcct gtgcggggag ggtaaggcgg ggccagcaac ttcctcagct 120
ggagggagag cgcacggtgg agccgccagt tgagaaggac tctgatccgg ctcagctttc 180
caatcagctg cggaaggagc cacgctttcg ggggttgcaa g atg gcg gcc acc agt 236
Met Ala Ala Thr Ser
-30

gga act gat gag ccg gtt tcc ggg gag ttg gtg tct gtg gca cat gcg 284
Gly Thr Asp Glu Pro Val Ser Gly Glu Leu Val Ser Val Ala His Ala
-25 -20 -15 -10

ctt tct ctc cca gca gag tcg tat ggc aac gat cct gac att gag atg 332
Leu Ser Leu Pro Ala Glu Ser Tyr Gly Asn Asp Pro Asp Ile Glu Met
-5 1 5

gct tgg gcc atg aga gca atg cag cat gct gaa gtc tat tac aag ctg 380
Ala Trp Ala Met Arg Ala Met Gln His Ala Glu Val Tyr Tyr Lys Leu
10 15 20

att tca tca gtt gac cca cag ttc ctc aaa ctc acc aaa gta gat gac 428
Ile Ser Ser Val Asp Pro Gln Phe Leu Lys Leu Thr Lys Val Asp Asp
25 30 35

caa att tac tct gag ttc cgg aaa aat ttt gag acc ctt agg ata gat 476
Gln Ile Tyr Ser Glu Phe Arg Lys Asn Phe Glu Thr Leu Arg Ile Asp
40 45 50 55

gtg ttg gac cca gaa gaa ctc aag tca gaa tca gcc aaa gag aag tgg 524
Val Leu Asp Pro Glu Glu Leu Lys Ser Glu Ser Ala Lys Glu Lys Trp
60 65 70

agg cca ttc tgc ttg aag ttt aat ggg att gtt gaa gac ttc aac tat 572
Arg Pro Phe Cys Leu Lys Phe Asn Gly Ile Val Glu Asp Phe Asn Tyr
75 80 85

ggt act ttg ctg cga cta gat tgt tct cag ggc tac act gag gaa aac 620
Gly Thr Leu Leu Arg Leu Asp Cys Ser Gln Gly Tyr Thr Glu Glu Asn
90 95 100

acc atc ttt gcc ccc agg ata caa ttc ttt gcc att gaa att gct cgg 668
Thr Ile Phe Ala Pro Arg Ile Gln Phe Phe Ala Ile Glu Ile Ala Arg
105 110 115

aac cgg gaa ggc tat aac aaa gct gtt tat atc agt gtt cag gac aaa 716
Asn Arg Glu Gly Tyr Asn Lys Ala Val Tyr Ile Ser Val Gln Asp Lys
120 125 130 135

gaa gga gag aaa gga gtc aac aat gga gga gaa aaa aga gct gac agt 764
Glu Gly Glu Lys Gly Val Asn Asn Gly Gly Glu Lys Arg Ala Asp Ser
140 145 150

gga gaa gaa gag aac acc aag aat gga gga gag aaa gga gct gat agt 812
Gly Glu Glu Glu Asn Thr Lys Asn Gly Gly Glu Lys Gly Ala Asp Ser
155 160 165

gga gaa gaa aaa gag gaa gga atc aac aga gaa gac aaa act gac aaa 860
Gly Glu Glu Lys Glu Glu Gly Ile Asn Arg Glu Asp Lys Thr Asp Lys
170 175 180

gga gga gaa aaa ggg aaa gaa gct gac aaa gaa atc aac aaa agt ggt 908
Gly Gly Glu Lys Gly Lys Glu Ala Asp Lys Glu Ile Asn Lys Ser Gly
185 190 195

gaa aaa gct atg taaggtatac agggaacagc actctagaag ctatgactca 960
Glu Lys Ala Met
200

attgagacta caagtaccac ggtgctactt gcacagaccc ctttggttaa atgtaaattc 1020
ttgtacaatt gaaggatacg cagaaggaca tctttctagt ctaacagtca ggagctgctc 1080
tggtcattcc cttgtatgaa ctggtctaaa gactgttagt ggggtgttag ttgatttttc 1140
ctggtatact gtttcttggc tgacactact ggtcaagtaa gaaatttgta aataaatttc 1200
ttttggttct tattaaaaca aaaaaaaaaa aaaaaa 1236



<210> 66
<211> 881
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 101..355

<220>
<221> sig_peptide
<222> 101..160
<223> Von Heijne matrix
score 9.32665652007071
seq LFLCYLLLFTCSG/VE

<400> 66
ttactcgctg ctgtgcccat ctatcagcag gctccgggct gaagattgct tctcttctct 60
cctccaaggt ctagtgacgg agcccgcgcg cggcgccacc atg cgg cag aag gcg 115
Met Arg Gln Lys Ala
-20

gta tcg ctt ttc ttg tgc tac ctg ctg ctc ttc act tgc agt ggg gtg 163
Val Ser Leu Phe Leu Cys Tyr Leu Leu Leu Phe Thr Cys Ser Gly Val
-15 -10 -5 1

gag gca ggt aag aaa aag tgc tcg gag agc tcg gac agc ggc tcc ggg 211
Glu Ala Gly Lys Lys Lys Cys Ser Glu Ser Ser Asp Ser Gly Ser Gly
5 10 15

ttc tgg aag gcc ctg acc ttc atg gcc gtc gga gga gga ctc gca gtc 259
Phe Trp Lys Ala Leu Thr Phe Met Ala Val Gly Gly Gly Leu Ala Val
20 25 30

gcc ggg ctg ccc gcg ctg ggc ttc acc ggc gcc ggc atc gcg gcc aac 307
Ala Gly Leu Pro Ala Leu Gly Phe Thr Gly Ala Gly Ile Ala Ala Asn
35 40 45

tcg gtg gct gcc tcg ctg atg agc tgg tct gcg atc ctg aat ggg ggc 355
Ser Val Ala Ala Ser Leu Met Ser Trp Ser Ala Ile Leu Asn Gly Gly
50 55 60 65

tagtggccac gctgcagagc ctcggggctg gtggcagcag cgtcgtcata ggtaatattg 415
gtgccctgat gggctacgcc acccacaagt atctcgatag tgaggaggat gaggagtagc 475
cagcagctcc cagaacctct tcttccttct tggcctaact cttccagtta ggatctagaa 535
ctttgccttt tttttttttt tttttttttt ttgagatggg ttctcactat attgtccagg 595
ctagagtgca gkggctattc acagatgcga acatagtaca ctgcagcctc caactcctag 655
cctcaagtga tcctcctgtc tcaacctccc aagtaggatt acaagcatgc gccgacgatg 715
cccaraatcc araactttgt ctatcactct ccccaacaac ctagatgtga aaacagaata 775
aacttcaccc agaaaaaaaa aaammacaar aaaaaaaaaa aaaaaaaaaa aaaaaaaaam 835
aaaaaaaaaa rrraaaaaaa aaaaaaaaga aaaaaaaaaa aaaaaa 881

<210> 67
<211> 524
<212> DNA
<213> Homo sapiens

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<220> 
<221> CDS
<222> 173..487

<220>
<221> sig_peptide
<222> 173..301
<223> Von Heijne matrix
score 4.27484469223909
seq AGSLVATLQSVGA/AG

<400> 67
agggcagagt aggcgcgtcc ctactggatg gagggggaag taacacccca agaacgctgt 60
catttcctgg gccaagttgg gacccggacg gcctcaccat gatgaaacgg gcagctgctg 120
ctgcagtggg aggagccctg gcagtggggg ctgtgccgtg gtgctcagtg cc atg ggc 178
Met Gly

ttc act ggg gca gga atc gcc gcg tcc tcc ata gca gcc aag atg atg 226
Phe Thr Gly Ala Gly Ile Ala Ala Ser Ser Ile Ala Ala Lys Met Met
-40 -35 -30

tcc gca gca gcc att gcc aac ggg ggt ggt gtt tct gcg ggg agc ctg 274
Ser Ala Ala Ala Ile Ala Asn Gly Gly Gly Val Ser Ala Gly Ser Leu
-25 -20 -15 -10

gtg gct act ctg cag tcc gtg ggg gca gct gga ctc tcc aca tca tcc 322
Val Ala Thr Leu Gln Ser Val Gly Ala Ala Gly Leu Ser Thr Ser Ser
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<210> 68 
<211> 1472
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 210..1082

<220>
<221> sig_peptide
<222> 210..311
<223> Von Heijne matrix
score 10.0571391689271
seq FLCLLSALLLTEG/KK

<400> 68
acagtacctc acaggtctct tcccccgagc agtgcattgc tggagcgagg agaagctcac 60
gaatcagctg caggtctctg ttttgaaaaa gcagagatac agaggcagag gaaaagggtg 120
gactcctatg tgacctgttc ttagagcaag acaatcacca tctgaattcc agaagccctg 180
ttcatggttg gggatatttt ctcgactgc atg gaa tca gaa aga agc aaa agg 233
Met Glu Ser Glu Arg Ser Lys Arg
-30
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Met Gly Asn 
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215 
gcc tgc att ccc ctg aaa aga att
Ala Cys Ile Pro Leu Lys Arg Ile
-20

gcg ctt ttg ctg act gag ggg aag
Ala Leu Leu Leu Thr Glu Gly Lys
-5 1

gcc gtg tgt act tgt acc aaa gat
Ala Val Cys Thr Cys Thr Lys Asp
10 15

tcc att cca cgc acc gtt cct cct
Ser Ile Pro Arg Thr Val Pro Pro
30

aga tct gtt ttt act gaa atc tca
Arg Ser Val Phe Thr Glu Ile Ser
45

tcg ctg cag ctc ttg tta ttc aca
Ser Leu Gln Leu Leu Leu Phe Thr
60 65

gat gat gct ttt att ggt ctt cca
Asp Asp Ala Phe Ile Gly Leu Pro
75 80

aac aac aac atc aag tca att tca
Asn Asn Asn Ile Lys Ser Ile Ser
90 95

aag tca tta att cac ttg agc ctt gca
Lys Ser Leu Ile His Leu Ser Leu Ala
105 110

cca aaa gat att ttc aaa ggc ctg gat
Pro Lys Asp Ile Phe Lys Gly Leu Asp
125
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140 145 
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155 160 
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170 175 
ata cac aga ttt aca aac atg tca
Ile His Arg Phe Thr Asn Met Ser
250 255 
tctgtggctg ccatcagaaa ttttctacag 
gactcttctt atcacacttg caaatgaatg 
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tgtacagaaa caactgccaa ataaaatgtt 
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<211> 1737 

<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 172..1449

<220>
<221> sig_peptide
<222> 172..255
<223> Von Heijne matrix
score 5.94825670923113
seq XVLLEPFVHQVGG/HS

<400> 69
aaacaatagg acggaaacgc cgaggaaccc ggctgaggcg gcagagcatc ctggccagaa 60
caagccaagg agccaagacg agagggacac acggacaaac aacagacaga agacgtactg 120
gccgctggac tccgctgcct cccccatctc cccgccatct gcgcccggag g atg agc 177
Met Ser

cca gcc ttc agg gcc atg gat gtg gag ccc cgc gcc aaa ggc gtc ctt 225
Pro Ala Phe Arg Ala Met Asp Val Glu Pro Arg Ala Lys Gly Val Leu
-25 -20 -15

ctg gag ccc ttt gtc cac cag gtc ggg ggg cac tca tgc gtg ctc cgc 273
Leu Glu Pro Phe Val His Gln Val Gly Gly His Ser Cys Val Leu Arg
-10 -5 1 5
Gln His Gly Asp Asp Ala Ser Glu Glu Lys Ala Ala Asn Gln Ile Arg
200 205 210

aaa tgt cag cag agc aca tct gca gtc att ggt gtg cgt gtg tgt ggc 945
Lys Cys Gln Gln Ser Thr Ser Ala Val Ile Gly Val Arg Val Cys Gly
215 220 225 230

atg cag gtg tac caa gca ggc agt ggg cag ctc atg ttc atg aac aag 993
Met Gln Val Tyr Gln Ala Gly Ser Gly Gln Leu Met Phe Met Asn Lys
235 240 245

tac cat gga cgg aag cta tcg atg cag ggc ttc aag gag gca ctt ttc 1041
Tyr His Gly Arg Lys Leu Ser Met Gln Gly Phe Lys Glu Ala Leu Phe
250 255 260

cag ttc ttc cac aat ggg cgg tac ctg cgc cgt gaa ctc ctg ggc cct 1089
Gln Phe Phe His Asn Gly Arg Tyr Leu Arg Arg Glu Leu Leu Gly Pro
265 270 275

gtg ctc aag aag ctg act gag ctc aag gca gtg ttg gag cga cag gag 1137
Val Leu Lys Lys Leu Thr Glu Leu Lys Ala Val Leu Glu Arg Gln Glu
280 285 290

tcc tac cgc ttc tac tca agc tcc ctg ctg gtc att tat gat ggc aag 1185
Ser Tyr Arg Phe Tyr Ser Ser Ser Leu Leu Val Ile Tyr Asp Gly Lys
295 300 305 310

gag cgg ccc gaa gtg gtc ctg gac tca gat gct gag gat ttg gag gac 1233
Glu Arg Pro Glu Val Val Leu Asp Ser Asp Ala Glu Asp Leu Glu Asp
315 320 325

ctg tca gag gaa tca gct gat gag tct gct ggt gcc tat gcc tac aaa 1281
Leu Ser Glu Glu Ser Ala Asp Glu Ser Ala Gly Ala Tyr Ala Tyr Lys
330 335 340

ccc atc ggc gcc agc tct gta gat gtg cgc atg atc gac ttt gca cac 1329
Pro Ile Gly Ala Ser Ser Val Asp Val Arg Met Ile Asp Phe Ala His
345 350 355

acc acc tgc agg ctg tat ggc gag gac acc gtg gtg cat gag ggc cag 1377
Thr Thr Cys Arg Leu Tyr Gly Glu Asp Thr Val Val His Glu Gly Gln
360 365 370

gat gct ggc tat atc ttc ggg ctc cag agc ctg ata gac att gtc aca 1425
Asp Ala Gly Tyr Ile Phe Gly Leu Gln Ser Leu Ile Asp Ile Val Thr
375 380 385 390

gag ata agt gag gag agt ggg gag tgagcttgct agctgctcca gtacttgaga 1479
Glu Ile Ser Glu Glu Ser Gly Glu
395

gcgactctgt gtcccaggca cagctgtgct gcgtcaggga ggaagccagt atggccaggt 1539
ggtggctcct gcagcctgga gctgatgtgc agtggcctct gtgagcccca gcctgagcca 1599
gtcccagctg tgcttggagt ctttatttat tttaactatt tcttcaacat tccacatttg 1659
atgatgatac ctctttcttc cctgagtgta tatgttctaa tacaaatctt tttgtttatt 1719
gaaaaaaaaa aaaaaaaa 1737

<210> 70
<211> 1637
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 30..1427

<220>
<221> sig_peptide
<222> 30..77
<223> Von Heijne matrix
score 3.71064775937629
seq YAAAAGVLAGVES/RQ

<400> 70
ctaatcgaaa agtaaaggcg cgcgggaac atg ggg ctg tat gct gca gct gca 53
Met Gly Leu Tyr Ala Ala Ala Ala
gtc atc gat 725
Leu Pro Ala Met Leu Leu Asp Pro Pro Pro Gly Ser His Val Ile Asp
205 210 215

gcc tgt gcc gcc cca ggc aat aag acc agt cac ttg gct gct ctt ctg 773
Ala Cys Ala Ala Pro Gly Asn Lys Thr Ser His Leu Ala Ala Leu Leu
220 225 230

aag aac caa ggg aag atc ttt gcc ttt gac ctg gat gcc aag cgg ctg 821
Lys Asn Gln Gly Lys Ile Phe Ala Phe Asp Leu Asp Ala Lys Arg Leu
235 240 245

gca tcc atg gcc acg ctg ctg gcc cgg gct ggc gtc tct tgc tgt gaa 869
Ala Ser Met Ala Thr Leu Leu Ala Arg Ala Gly Val Ser Cys Cys Glu
250 255 260

ctg gct gag gag gac ttc ctg gcg gtc tcc ccc tcg gat cca cgc tac 917
Leu Ala Glu Glu Asp Phe Leu Ala Val Ser Pro Ser Asp Pro Arg Tyr
265 270 275 280

cat gag gtc cac tac atc ctg ctg gat cct tcc tgc agt ggc tcg ggt 965
His Glu Val His Tyr Ile Leu Leu Asp Pro Ser Cys Ser Gly Ser Gly
285 290 295

atg ccg agc aga cag ctg gag gag ccc ggg gca ggc aca cct agc ccg 1013
Met Pro Ser Arg Gln Leu Glu Glu Pro Gly Ala Gly Thr Pro Ser Pro
300 305 310

gtg cgt ctg cat gcc ctg gca ggc ttc cag cag cga gcc ctg tgc cac 1061
Val Arg Leu His Ala Leu Ala Gly Phe Gln Gln Arg Ala Leu Cys His
315 320 325

gcg ctc act ttc cct tcc ctg cag cgg ctc gtc tac tcc acg tgc tcc 1109
Ala Leu Thr Phe Pro Ser Leu Gln Arg Leu Val Tyr Ser Thr Cys Ser
330 335 340

ctc tgc cag gag gag aat gaa gac gtg gtg cga gat gcg ctg cag cag 1157
Leu Cys Gln Glu Glu Asn Glu Asp Val Val Arg Asp Ala Leu Gln Gln
345 350 355 360

aac ccg ggc gcc ttc agg cta gct ccc gcc ctg cct gcc tgg ccc cac 1205
Asn Pro Gly Ala Phe Arg Leu Ala Pro Ala Leu Pro Ala Trp Pro His
365 370 375

cga ggc ctg agc acg ttc ccg ggt gcc gag cac tgc ctc cgg gcc tcc 1253
Arg Gly Leu Ser Thr Phe Pro Gly Ala Glu His Cys Leu Arg Ala Ser
380 385 390

cct gag acc aca ctc agc agt ggc ttc ttc gtt gct gta att gaa cgg 1301
Pro Glu Thr Thr Leu Ser Ser Gly Phe Phe Val Ala Val Ile Glu Arg
395 400 405

gtc gag gtg cca agc tca gcc tca cag gcc aaa gca tca gca cca gaa 1349
Val Glu Val Pro Ser Ser Ala Ser Gln Ala Lys Ala Ser Ala Pro Glu
410 415 420

cgc aca ccc agc cca gcc cca aag aga aag aag aga cag caa aga gcc 1397
Arg Thr Pro Ser Pro Ala Pro Lys Arg Lys Lys Arg Gln Gln Arg Ala
425 430 435 440

gca gcc ggt gct tgc aca ccg cct tgc aca tagcagaggc tccgggctga 1447
Ala Ala Gly Ala Cys Thr Pro Pro Cys Thr
445 450

ctccttcctg gtgggaaagg aagatgcctg tcctctccgt ggaggaccct gggccctcac 1507
cgcaggaagc agtttgggtt ttgaaaggtt attgggtccc ttccttgggc tgtgttcttg 1567
ctggtgagca aagtgttacc tgcaaaaata aaatgcagaa cgtactctac gacaaaaaaa 1627
aaaaaaaaaa 1637

<210> 71
<211> 1636
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 30..1175

<220>
<221> sig_peptide
<222> 30..77
<223> Von Heijne matrix
score 3.71064775937629
seq YAAAAGVLAGVES/RQ

<400> 71
ctaatcgaaa agtaaaggcg cgcgggaac atg ggg ctg tat gct gca gct gca 53
Met Gly Leu Tyr Ala Ala Ala Ala
-15 -10

ggc gtg ttg gcc ggc gtg gag agc cgc cag ggc tct atc aag ggg ttg 101
Gly Val Leu Ala Gly Val Glu Ser Arg Gln Gly Ser Ile Lys Gly Leu
-5 1 5

gtg tac tcc agc aac ttc cag aac gtg aag cag ctg tac gcg ctg gtg 149
Val Tyr Ser Ser Asn Phe Gln Asn Val Lys Gln Leu Tyr Ala Leu Val
10 15 20

tgc gaa acg cag cgc tac tcc gcc gtg ctg gat gct gtg atc gcc agc 197
Cys Glu Thr Gln Arg Tyr Ser Ala Val Leu Asp Ala Val Ile Ala Ser
25 30 35 40

gcc ggc ctc ctc cgt gcg gag aag aag ctg cgg ccg cac ctg gcc aag 245
Ala Gly Leu Leu Arg Ala Glu Lys Lys Leu Arg Pro His Leu Ala Lys
45 50 55

gtg cta gtg tat gag ttg ttg ttg gga aag ggc ttt cga ggg ggt ggg 293
Val Leu Val Tyr Glu Leu Leu Leu Gly Lys Gly Phe Arg Gly Gly Gly
60 65 70

ggc cga tgg aag gct ctg ttg ggc cgg cac cag gcg agg ctc aag gct 341
Gly Arg Trp Lys Ala Leu Leu Gly Arg His Gln Ala Arg Leu Lys Ala
75 80 85

gag ttg gct cgg ctc aag gtt cat cgg ggt gtg agc cgg aat gag gac 389
Glu Leu Ala Arg Leu Lys Val His Arg Gly Val Ser Arg Asn Glu Asp
90 95 100

ctg ttg gaa gtg gga tcc agg cct ggt cca gcc tcc cag ctg cct cga 437
Leu Leu Glu Val Gly Ser Arg Pro Gly Pro Ala Ser Gln Leu Pro Arg
105 110 115 120

ttt gtg cgt gtg aac act ctc aag acc tgc tcc gat gat gta gtt gat 485
Phe Val Arg Val Asn Thr Leu Lys Thr Cys Ser Asp Asp Val Val Asp
125 130 135

tat ttc aag aga caa ggt ttc tcc tat cag ggt cgg gct tcc agc ctc 533
Tyr Phe Lys Arg Gln Gly Phe Ser Tyr Gln Gly Arg Ala Ser Ser Leu
140 145 150

gat gac tta cga gcc ctc aag ggg aag cat ttt ctc ctg gac ccc ttg 581
Asp Asp Leu Arg Ala Leu Lys Gly Lys His Phe Leu Leu Asp Pro Leu
155 160 165

atg ccg gag ctg ctg gtg ttt ccc gcc cag aca gat ctg cat gaa cac 629
Met Pro Glu Leu Leu Val Phe Pro Ala Gln Thr Asp Leu His Glu His
170 175 180

cca ctg tac cgg gcc gga cac ctc att ctg cag gac agg gcc agc tgt 677
Pro Leu Tyr Arg Ala Gly His Leu Ile Leu Gln Asp Arg Ala Ser Cys
185 190 195 200

ctc cca gcc atg ctg ctg gac ccc ccg cca ggc tcc cat gtc atc gat 725
Leu Pro Ala Met Leu Leu Asp Pro Pro Pro Gly Ser His Val Ile Asp
205 210 215

gcc tgt gcc gcc cca ggc aat aag acc agt cac ttg gct gct ctt ctg 773
Ala Cys Ala Ala Pro Gly Asn Lys Thr Ser His Leu Ala Ala Leu Leu
220 225 230

aag aac caa ggg aag atc ttt gcc ttt gac ctg gat gcc aag cgg ctg 821
Lys Asn Gln Gly Lys Ile Phe Ala Phe Asp Leu Asp Ala Lys Arg Leu
235 240 245

gca tcc atg gcc acg ctg ctg gcc cgg gct ggc gtc tct tgc tgt gaa 869
Ala Ser Met Ala Thr Leu Leu Ala Arg Ala Gly Val Ser Cys Cys Glu
250 255 260

ctg gct gag gag gac ttc ctg gcg gtc tcc ccc tcg gat cca cgc tac 917
Leu Ala Glu Glu Asp Phe Leu Ala Val Ser Pro Ser Asp Pro Arg Tyr
265 270 275 280

cat gag gtc cac tac atc ctg ctg gat cct tcc tgc agt ggc tcg ggt 965
His Glu Val His Tyr Ile Leu Leu Asp Pro Ser Cys Ser Gly Ser Gly
285 290 295

atg ccg agc aga cag ctg gag gag ccc ggg gca ggc aca cct agc ccg 1013
Met Pro Ser Arg Gln Leu Glu Glu Pro Gly Ala Gly Thr Pro Ser Pro
300 305 310

gtg cgt ctg cat gcc ctg gca gct tcc agc agc gag ccc tgt gcc acg 1061
Val Arg Leu His Ala Leu Ala Ala Ser Ser Ser Glu Pro Cys Ala Thr
315 320 325

cgc tca ctt tcc ctt ccc tgc agc ggc tcg tct act cca cgt gct ccc 1109
Arg Ser Leu Ser Leu Pro Cys Ser Gly Ser Ser Thr Pro Arg Ala Pro
330 335 340

tct gcc agg agg aga atg aag acg tgg tgc gag atg cgc tgc agc aga 1157
Ser Ala Arg Arg Arg Met Lys Thr Trp Cys Glu Met Arg Cys Ser Arg
345 350 355 360

acc cgg gcg cct tca ggc tagctcccgc cctgcctgcc tggccccacc 1205
Thr Arg Ala Pro Ser Gly
365

gaggcctgag cacgttcccg ggtgccgagc actgcctccg ggcctcccct gagaccacac 1265
tcagcagtgg cttcttcgtt gctgtaattg aacgggtcga ggtgccaagc tcagcctcac 1325
aggccaaagc atcagcacca gaacgcacac ccagcccagc cccaaagaga aagaagagac 1385
agcaaagagc cgcagccggt gcttgcacac cgccttgcac atagcagagg ctccgggctg 1445
actccttcct ggtgggaaag gaagatgcct gtcctctccg tggaggaccc tgggccctca 1505
ccgcaggaag cagtttgggt tttgaaaggt tattgggtcc cttccttggg ctgtgttctt 1565
gctggtgagc aaagtgttac ctgcaaaaat aaaatgcaga acgtactcta cgacaaaaaa 1625
aaaaaaaaaa a 1636

<210> 72
<211> 1758
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 66..839

<220>
<221> sig_peptide
<222> 66..173
<223> Von Heijne matrix
score 4.89555877630516
seq LLLLRLNDAALRA/LQ

<400> 72
a 9 a 99 a 99 fc 9 gcggtggtgg ccctcgcctg tggcccccgt gctgcttgca ctcgaactcg 60
tcgcc atg gag gag ctc cag gag cct ctg aga gga gag ctc cgg ctc tgc 110
Met Glu Glu Leu Gln Glu Pro Leu Arg Gly Glu Leu Arg Leu Cys
-35 -30 -25

ttc acg caa gct gcc cgg act agc ctc tta ctg ctc agg ctc aac gac 158
Phe Thr Gln Ala Ala Arg Thr Ser Leu Leu Leu Leu Arg Leu Asn Asp
-20 -15 -10

gct gcc ctg cgg gcg ctg caa gag tgt cag cgg caa cag gta cgg ccg 206
Ala Ala Leu Arg Ala Leu Gln Glu Cys Gln Arg Gln Gln Val Arg Pro
-5 1 5 10

gtg att gct ttc caa ggc cac cga ggg tat ctg aga ctc cca ggc cct 254
Val Ile Ala Phe Gln Gly His Arg Gly Tyr Leu Arg Leu Pro Gly Pro
15 20 25

ggt tgg tcc tgc ctc ttc tcc ttc ata gtg tcc cag tgt tgt cag gag 302
Gly Trp Ser Cys Leu Phe Ser Phe Ile Val Ser Gln Cys Cys Gln Glu
30 35 40

ggc gct ggt ggt agc ttg gac ctt gtg tgc caa cgc ttc ctc agg tct 350
Gly Ala Gly Gly Ser Leu Asp Leu Val Cys Gln Arg Phe Leu Arg Ser
45 50 55

ggg cct aac agc ctc cac tgc ctg ggc tca ctc agg gag cgc ctc att 398
Gly Pro Asn Ser Leu His Cys Leu Gly Ser Leu Arg Glu Arg Leu Ile
60 65 70 75

att tgg gca gcc atg gat tct atc cca gcc cca tca tca gtt cag gga 446
Ile Trp Ala Ala Met Asp Ser Ile Pro Ala Pro Ser Ser Val Gln Gly
80 85 90

cac aac ctg act gaa gat gcc aga cat cct gag agt tgg cag aac aca 494
His Asn Leu Thr Glu Asp Ala Arg His Pro Glu Ser Trp Gln Asn Thr
95 100 105

gga ggc tat tct gaa gga gat gca gta tca cag cca cag atg gca cta 542
Gly Gly Tyr Ser Glu Gly Asp Ala Val Ser Gln Pro Gln Met Ala Leu
110 115 120

gag gag gtg tca gtg tca gat cca ctg gca agc aac caa gga cag tca 590
Glu Glu Val Ser Val Ser Asp Pro Leu Ala Ser Asn Gln Gly Gln Ser
125 130 135

ctc cca gga tcc tca agg gag cac atg gca cag tgg gaa gtg aga agc 638
Leu Pro Gly Ser Ser Arg Glu His Met Ala Gln Trp Glu Val Arg Ser
140 145 150 155

cag acc cat gtt cca aac aga gaa cct gtt cag gca ctg cct tcc tct 686
Gln Thr His Val Pro Asn Arg Glu Pro Val Gln Ala Leu Pro Ser Ser
160 165 170

gcc agc cgg aaa cgt ctg gac aag aaa cgt tca gtg cct gta gcc act 734
Ala Ser Arg Lys Arg Leu Asp Lys Lys Arg Ser Val Pro Val Ala Thr
175 180 185

gta gaa ctg gaa gaa aag agg ttc aga act ctg cct tta gtg ccc ccc 782
Val Glu Leu Glu Glu Lys Arg Phe Arg Thr Leu Pro Leu Val Pro Pro
190 195 200

cct aca agg cct gac caa tca gga ttt aca aga ggg aga aga ttg gga 830
Pro Thr Arg Pro Asp Gln Ser Gly Phe Thr Arg Gly Arg Arg Leu Gly
205 210 215

gca aga aga tgaggacatg gaccccagat tagaacacaa ttcctcagtt 879
Ala Arg Arg
220

caagaagatt ctgaatcccc aagtcctgaa gatataccag actacctcct gcaatacagg 939
gccatccaca gtgcagaaca gcaacatgcc tatgagcagg actttgagac agattatgct 999
gaataccgca tcctgcatgc ccgtgttggg actgcaagcc aaaggttcat agagctggga 1059
gcagagatta aaagagttcg gcgaggaact ccagaataca aggtcctgga agacaagata 1119
atccaggaat ataaaaagtt caggaagcag tacccaagtt acagagaaga aaagcgtcgc 1179
tgtgagtacc ttcaccagaa attgtcccac attaaaggtc tcatcctgga gtttgaggaa 1239
aagaacaggg gcagctgaag ttatcaaggg aatttttgag cctctgctta gtgaaacaca 1299
aaggaacaaa gcagctataa actaaataga atgcaactat ctgcttttct tatgctgacc 1359
actggagtcc atggtggcaa gtagagagct gctctaggtt cttgaggttt ggttttcatt 1419
attaattttt agggtatggg cactgtgcaa agactccata gctgtgccta ggagtctagg 1479
aaaagtgaca gaggcttggc ttttttacct ttagttcagc caagtcattt tcaagtcctg 1539
agaaatgaca tcatcttcag gataaaataa tgaggacatt agacaaacca aactaagtga 1599
attttagcct ggtagcctct ctaaggaaac agtaataata acttctgata agagttaaaa 1659
gaacttgtag catacctgga tataacggga aagggcctgg gtgttaccca tgtactgaaa 1719
atgaactttt accaacatgg ccaaaaaaaa aaaaaaaaa 1758



<210> 73
<211> 1647
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 64..903

<220>
<221> sig_peptide
<222> 64..162
<223> Von Heijne matrix
score 10.6748773272319
seq LLLLPFLPLLLLA/AP

<400> 73
agctcaaggg gcctcgagga ctctctgcgt ctctggagac aagggcacta cacgcacttc 60
aga atg aag agt tgc ggg agc atg ctg ggg ctc tgg ggg cag cgg ctc 108
Met Lys Ser Cys Gly Ser Met Leu Gly Leu Trp Gly Gln Arg Leu
-30 -25 -20

ccc gcg gcg tgg gtc ctg ctt ctg ttg cct ttc ctg ccg ctg ctg ctg 156
Pro Ala Ala Trp Val Leu Leu Leu Leu Pro Phe Leu Pro Leu Leu Leu
-15 -10 -5

ctt gca gcc ccc gcg ccc cac cgc gcg tcc tac aag ccg gtc atc gtg 204
Leu Ala Ala Pro Ala Pro His Arg Ala Ser Tyr Lys Pro Val Ile Val
1 5 10

gtg cat ggg ctc ttc gac agc tcg tac agc ttc cgc cac ctg ctg gaa 252
Val His Gly Leu Phe Asp Ser Ser Tyr Ser Phe Arg His Leu Leu Glu
15 20 25 30

tac atc aat gag aca cac ccc ggg act gtg gtg aca gtg ctc gat ctc 300
Tyr Ile Asn Glu Thr His Pro Gly Thr Val Val Thr Val Leu Asp Leu
35 40 45

ttc gat ggg aga gag agc ttg cga ccc ctg tgg gaa cag gtg caa ggg 348
Phe Asp Gly Arg Glu Ser Leu Arg Pro Leu Trp Glu Gln Val Gln Gly
50 55 60




ttc cga gag gct gtg gtc ccc atc atg gca aag gcc cct caa ggg gtg 396
Phe Arg Glu Ala Val Val Pro Ile Met Ala Lys Ala Pro Gln Gly Val
65 70 75

cat ctc atc tgc tac tcg cag ggg ggc ctt gtg tgc cgg gct ctg ctt 444
His Leu Ile Cys Tyr Ser Gln Gly Gly Leu Val Cys Arg Ala Leu Leu
80 85 90

tct gtc atg gat gat cac aac gtg gat tct ttc atc tcc ctc tcc tct 492
Ser Val Met Asp Asp His Asn Val Asp Ser Phe Ile Ser Leu Ser Ser
95 100 105 110

cca cag atg gga cag tat gga gac acg gac tac ttg aag tgg ctg ttc 540
Pro Gln Met Gly Gln Tyr Gly Asp Thr Asp Tyr Leu Lys Trp Leu Phe
115 120 125

ccc acc tcc atg cgg tct aac ctc tat cgg atc tgc tat agc ccc ctg 588
Pro Thr Ser Met Arg Ser Asn Leu Tyr Arg Ile Cys Tyr Ser Pro Leu
130 135 140

atc aat ggg gaa aga gac cat ccc aat gcc aca gta tgg cgg aag aac 636
Ile Asn Gly Glu Arg Asp His Pro Asn Ala Thr Val Trp Arg Lys Asn
145 150 155

ttt ctg cgt gtg ggc cac ctg gtg ctg att ggg ggc cct gat gat ggt 684
Phe Leu Arg Val Gly His Leu Val Leu Ile Gly Gly Pro Asp Asp Gly
160 165 170

gtt att act ccc tgg cag tcc agc ttc ttt ggt ttc tat gat gca aat 732
Val Ile Thr Pro Trp Gln Ser Ser Phe Phe Gly Phe Tyr Asp Ala Asn
175 180 185 190

gag acc gtc ctg gag atg gag gag caa ctg gtt tat ctg cgg gat tct 780
Glu Thr Val Leu Glu Met Glu Glu Gln Leu Val Tyr Leu Arg Asp Ser
195 200 205

ttt ggg ttg aag act cta ttg gcc cgg ggg gcc ata gtg agg tgt cca 828
Phe Gly Leu Lys Thr Leu Leu Ala Arg Gly Ala Ile Val Arg Cys Pro
210 215 220

atg gcc ggt atc tcc cac aca gcc tgg cac tcc aac cgt acc ctt tat 876
Met Ala Gly Ile Ser His Thr Ala Trp His Ser Asn Arg Thr Leu Tyr
225 230 235

gag acc tgc att gaa cct tgg ctc tcc tgaggatata ttcaggggtc 923
Glu Thr Cys Ile Glu Pro Trp Leu Ser
240 245

cccaggaact cctcggtcca gagaccaagt ggtggccttg gaaagcagat gtcaggcttt 983
ggtgtgcctg tgaccacctc attgctccca tattatcccc catttttagt agagacgggg 1043
ttttagtaga gacttggcct cccagaaccc ccttcctctg ctcctccatg aatgacaatt 1103
ccaggcctcc cctacctcat gtcctctcat ttgggggatt gctccgtgct gtccctttct 1163
ctcaaggccg aagttcggaa gtgagaaacc atgtttttaa cttgtggctg ctcttgctgc 1223
tgctgctcct ccgtatctgg ctgtatgggt ggagaaccca cccactgccc accacagggg 1283
tctccttcca ggccactcag gacattttta gcttctctcc tccccatgtt cccttttttc 1343
tctaaagtcc cctgacatca gccctcccaa ctcctaagag ggactaccca tgagagtggg 1403
gttctgaggc tcccctatgg ggacagttcc gttcttgaag tgtcagtgtt ggggaatatc 1463
tgtggcctat gaggcccatc tcaggtttgg ggatccccca gtccctatga tcagtgttgg 1523
agtacccccc tgggagagcc tagtttcttt gaggccccag gccctctttt aactaccttt 1583
gaataggtgt tatccctgta tttatggaaa taaagttcca tttcctcaaa aaaaaaaaaa 1643
aaaa 1647

<210> 74
<211> 1646
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 64..585

<220>
<221> sig_peptide
<222> 64..162
<223> Von Heijne matrix
score 10.6748773272319
seq LLLLPFLPLLLLA/AP

<400> 74
agctcaaggg gcctcgagga ctctctgcgt ctctggagac aagggcacta cacgcacttc 60
aga atg aag agt tgc ggg agc atg ctg ggg ctc tgg ggg cag cgg ctc 108
Met Lys Ser Cys Gly Ser Met Leu Gly Leu Trp Gly Gln Arg Leu
-30 -25 -20

ccc gcg gcg tgg gtc ctg ctt ctg ttg cct ttc ctg ccg ctg ctg ctg 156
Pro Ala Ala Trp Val Leu Leu Leu Leu Pro Phe Leu Pro Leu Leu Leu
-15 -10 -5

ctt gca gcc ccc gcg ccc cac cgc gcg tcc tac aag ccg gtc atc gtg 204
Leu Ala Ala Pro Ala Pro His Arg Ala Ser Tyr Lys Pro Val Ile Val
1 5 10

gtg cat ggg ctc ttc gac agc tcg tac agc ttc cgc cac ctg ctg gaa 252
Val His Gly Leu Phe Asp Ser Ser Tyr Ser Phe Arg His Leu Leu Glu
15 20 25 30

tac atc aat gag aca cac ccc ggg act gtg gtg aca gtg ctc gat ctc 300
Tyr Ile Asn Glu Thr His Pro Gly Thr Val Val Thr Val Leu Asp Leu
35 40 45

ttc gat ggg aga gag agc ttg cga ccc ctg tgg gaa cag gtg caa ggg 348
Phe Asp Gly Arg Glu Ser Leu Arg Pro Leu Trp Glu Gln Val Gln Gly
50 55 60

ttc cga gag gct gtg gtc ccc atc atg gca aag gcc cct caa ggg gtg 396
Phe Arg Glu Ala Val Val Pro Ile Met Ala Lys Ala Pro Gln Gly Val
65 70 75

cat ctc atc tgc tac tcg cag ggg ggc ctt gtg tgc cgg gct ctg ctt 444
His Leu Ile Cys Tyr Ser Gln Gly Gly Leu Val Cys Arg Ala Leu Leu
80 85 90

tct gtc atg gat gat cac aac gtg gat tct ttc atc tcc ctc tcc tct 492
Ser Val Met Asp Asp His Asn Val Asp Ser Phe Ile Ser Leu Ser Ser
95 100 105 110

cca cag atg gga cag tat gga gac acg gac tac ttg aag tgg ctg ttc 540
Pro Gln Met Gly Gln Tyr Gly Asp Thr Asp Tyr Leu Lys Trp Leu Phe
115 120 125

ccc acc tcc atg cgg tct aac ctc tat cgg atc tgc tat agc ccc 585
Pro Thr Ser Met Arg Ser Asn Leu Tyr Arg Ile Cys Tyr Ser Pro
130 135 140

tgatcaatgg ggaaagagac catcccaatg ccacagtatg gcggaagaac tttctgcgtg 645
tgggccacct ggtgctgatt gggggccctg atgatggtgt tattactccc tggcagtcca 705
gcttctttgg tttctatgat gcaaatgaga ccgtcctgga gatggaggag caactggttt 765
atctgcggga ttcttttggg ttgaagactc tattggcccg gggggccata gtgaggtgtc 825
caatggccgg tatctcccac acagcctggc actccaaccg taccctttat gagacctgca 885
ttgaaccttg gctctcctga ggatatattc aggggtcccc aggaactcct cggtccagag 945
accaagtggt ggccttggaa agcagatgtc aggctttggt gtgcctgtga ccacctcatt 1005
gctcccatat tatcccccat ttttagtaga gacggggttt tagtagagac ttggcctccc 1065
agaaccccct tcctctgctc ctccatgaat gacaattcca ggcctcccct acctcatgtc 1125
ctctcatttg ggggattgct ccgtgctgtc cctttctctc aaggccgaag ttcggaagtg 1185
agaaaccatg tttttaactt gtggctgctc ttgctgctgc tgctcctccg tatctggctg 1245
tatgggtgga gaacccaccc actgcccacc acaggggtct ccttccaggc cactcaggac 1305
atttttagct tctctcctcc ccatgttccc ttttttctct aaagtcccct gacatcagcc 1365
ctcccaactc ctaagaggga ctacccatga gagtggggtt ctgaggctcc cctatgggga 1425
cagttccgtt cttgaagtgt cagtgttggg gaatatctgt ggcctatgag gcccatctca 1485
ggtttgggga tcccccagtc cctatgatca gtgttggagt acccccctgg gagagcctag 1545
tttctttgag gccccaggcc ctcttttaac tacctttgaa taggtgttat ccctgtattt 1605
atggaaataa agttccattt cctcaaaaaa aaaaaaaaaa a 1646



<210> 75
<211> 1963
<212> DNA
<213> Homo sapiens

<220>

WO 01/42451 PCT/IB00/01938

<221> CDS
<222> 274..753

<220>
<221> sig_peptide
<222> 274..324
<223> Von Heijne matrix
score 4.4969823290892
seq FAAFCYMLSLVLC/AA

<400> 75
cttcttcgat ttgcggacgg ttccctccag cgactctcga cacacgtttt cctgtcttcg 60
ccggagggcc gggtctgggg tcgccggagc ctgcgggaat ccagcgctta ttcgctaacc 120
ctcgagtcgc ttcgctagct gtgcgccctc ctgggcacta gcctggagag gagcgtgcag 180
acgcggctcc ttggagggag tgcggtcctc tagggaggca tcgggctcct aggggcttct 240
tggcgtgtgt ggtgggattg gggtccgccg gcc atg gcc ttc act ttc gct gcg 294
Met Ala Phe Thr Phe Ala Ala
-15

ttc tgc tac atg ctg tct ctg gtg ctg tgc gct gcg ctc atc ttc ttc 342
Phe Cys Tyr Met Leu Ser Leu Val Leu Cys Ala Ala Leu Ile Phe Phe
-10 -5 1 5

gcc atc tgg cac ata att gcc ttt gat gag tta agg aca gat ttt aag 390
Ala Ile Trp His Ile Ile Ala Phe Asp Glu Leu Arg Thr Asp Phe Lys
10 15 20

agc ccc ata gac cag tgc aat cct gtt cat gcg agg gaa cgg ttg agg 438
Ser Pro Ile Asp Gln Cys Asn Pro Val His Ala Arg Glu Arg Leu Arg
25 30 35

aac atc gag cgc atc tgc ttc ctt ctg cga aag ctg gtg ctg cca gaa 486
Asn Ile Glu Arg Ile Cys Phe Leu Leu Arg Lys Leu Val Leu Pro Glu
40 45 50

tac tcc atc cat agc ctc ttc tgc att atg ttc ctg tgt gcg caa gag 534
Tyr Ser Ile His Ser Leu Phe Cys Ile Met Phe Leu Cys Ala Gln Glu
55 60 65 70

tgg ctc acg ctg ggg ctg aat gtc cct cta ctt ttc tat cac ttc tgg 582
Trp Leu Thr Leu Gly Leu Asn Val Pro Leu Leu Phe Tyr His Phe Trp
75 80 85

agg tat ttc cac tgt cca gca gat agc tca gaa cta gcc tac gac cca 630
Arg Tyr Phe His Cys Pro Ala Asp Ser Ser Glu Leu Ala Tyr Asp Pro
90 95 100

ccg gtg gtc atg aat ccc gac act ttg agt tac tgt cag aag gag gcc 678
Pro Val Val Met Asn Pro Asp Thr Leu Ser Tyr Cys Gln Lys Glu Ala
105 110 115

tgg tgt aag ctg gcc ttc tat ctc ctc tcc ttc ttc tac tac ctt tac 726
Trp Cys Lys Leu Ala Phe Tyr Leu Leu Ser Phe Phe Tyr Tyr Leu Tyr
120 125 130

tgc atg atc tac act tta gtg agc tct taacgcaaag accatgcaca 773
Cys Met Ile Tyr Thr Leu Val Ser Ser
135 140

tcatcagaga ctgagatggg agaggcctga gacggagagg tgcatttctg ctggtgactg 833
gaggagggac cagaatgagg atacgtgaga aatagacccg gcaggcagtc agactgaatg 893
ggagctggaa tcacgcagca gttgggagcc gagttaaccc tgcgtgtctg tgtcaccctg 953
tttgtcaatc tttggcattc gaattccaca cacggggtcc tagagccctt ctgagcatca 1013
gtggtgtggg ggagtaggtg acgaaacact agacctctcc tgagagagaa ttgctgcttc 1073
ctgaatccac ttcattgaac agcaccttgc aagttcaaat gagttcctgg gagcggaggc 1133
tggaaggcca caaggtgctt gctaaggaac agaatgaccc agagtcaagg ccaagtctgc 1193
agggacctgt tgaaagcctc gagaatgtct tggctgccca agactcttgt tgcctttctt 1253
ccaagccatg gccatgccct ttttctcaaa tgggaggggc tggagggtgt gtgggatttg 1313
tcttcagctg caaccagcct tgagcctgct gggctatttt cagctgagga ggggtaatat 1373
aggaaaaatg catttttgaa acgtttgcaa catgatcaag gtgttagttc tccaccacac 1433
aagttgtatt cttcttttgc cacctcaaac catcacagag tctttaaatg caaatcaatt 1493
ggtcaatgct agtcaaagct atgttcttac aaaaacccca gacagctcag agctcagaaa 1553
atcctgtgga gtggctgctc tgtaccgtgg gcatccggca gccaggaagt gagacaacat 1613
aattataact ttgttttatg atgctgcatc atttgtactg tttaggtcga cgtgaggaca 1673
tcatcttatt tagaattttc cgtttggcat tctcttttgg gtgggagtta tgctgggggt 1733
tgtaaataat gacaaggctg agatttttat gatgtttaaa ttgggcacaa tgattttgac 1793
cttattcccc aaacttcttt tcttttctac tgtttaacat acacaggcta tttatacacg 1853
tccccagctc ccatctgaaa cctgtgactc aggtttatga atggtgtttg tgtagcaaca 1913
cattgtgtgc tatgtttatt aaaatgcagc gacaaaaaaa aaaaaaaaaa 1963

<210> 76 
<211> 1757 
<212> DNA 
10 <213> Homo sapiens 

<220> 

<221> CDS 

<222> 191 . . 1468 

15 

<220> 

<221> sig_peptide 

<222> 191 . .274 

<223> Von Heijne matrix 

20 score 4.02941490119842 
seq GXLLEPFVHQVGG/HS 

<400> 76 

cattttggtg cgagagaaac aataggacgg aaacgccgag gaacccggct gaggcggcag 60 

cagagcatcc tggccagaac aagccaagga gccaagacga gagggacaca ctgacaaaca 120
acagacagaa gacgtactgg ccgctggact ccgctgcctc ccccatctcc ccgccatctg 180
cgcccggagg atg agc cca gcc ttc agg gcc atg gat gtg gag ccc cgc 229
Met Ser Pro Ala Phe Arg Ala Met Asp Val Glu Pro Arg
-25 -20

gcc aaa ggc gtc ctt ctg gag ccc ttt gtc cac cag gtc ggg ggg cac 277
Ala Lys Gly Val Leu Leu Glu Pro Phe Val His Gln Val Gly Gly His
-15 -10 -5 1

tca tgc gtg ctc cgc ttc aat gag aca acc ctg tgc aag ccc ctg gtc 325
Ser Cys Val Leu Arg Phe Asn Glu Thr Thr Leu Cys Lys Pro Leu Val
5 10 15

cca agg gaa cat cag ttc tac gag acc ctc cct tct gag atg cgc aaa 373
Pro Arg Glu His Gln Phe Tyr Glu Thr Leu Pro Ser Glu Met Arg Lys
20 25 30

ttc act ccc cag tac aaa ggt gtg gta tct gtg cgc ttt gaa gaa gat 421
Phe Thr Pro Gln Tyr Lys Gly Val Val Ser Val Arg Phe Glu Glu Asp
35 40 45

gaa gac agg aac ttg tgt cta ata gca tat cca ttg aaa ggg gac cat 469
Glu Asp Arg Asn Leu Cys Leu Ile Ala Tyr Pro Leu Lys Gly Asp His
50 55 60 65

gga att gtg gac att gta gat aat tca gac tgt gaa cca aaa agt aag 517
Gly Ile Val Asp Ile Val Asp Asn Ser Asp Cys Glu Pro Lys Ser Lys
70 75 80

ctc cta agg tgg aca aca aac aaa aaa cat cat gtc tta gaa aca gaa 565
Leu Leu Arg Trp Thr Thr Asn Lys Lys His His Val Leu Glu Thr Glu
85 90 95

aag acc cct aag gac tgg gtg cgt cag cac cgt aaa gag gag aaa atg 613
Lys Thr Pro Lys Asp Trp Val Arg Gln His Arg Lys Glu Glu Lys Met
100 105 110

aag agc cat aag tta gaa gaa gaa ttt gag tgg cta aag aaa tct gaa 661
Lys Ser His Lys Leu Glu Glu Glu Phe Glu Trp Leu Lys Lys Ser Glu
115 120 125

gtc ttg tac tac act gta gag aag aag ggg aat ata agt tcc cag ctt 709
Val Leu Tyr Tyr Thr Val Glu Lys Lys Gly Asn Ile Ser Ser Gln Leu
130 135 140 145

aaa cac tat aac cct tgg agc atg aaa tgt cac cag caa cag tta cag 757
Lys His Tyr Asn Pro Trp Ser Met Lys Cys His Gln Gln Gln Leu Gln
150 155 160

aga atg aag gag aat gca aag cat cgg aac cag tac aaa ttt atc tta 805
Arg Met Lys Glu Asn Ala Lys His Arg Asn Gln Tyr Lys Phe Ile Leu
165 170 175

ctg gaa aac ctg act tcc cgc tat gag gtg cct tgt gtc ctt gac ctc 853
Leu Glu Asn Leu Thr Ser Arg Tyr Glu Val Pro Cys Val Leu Asp Leu
180 185 190

aag atg ggc aca cga caa cat ggt gat gat gct tca gag gag aag gca 901
Lys Met Gly Thr Arg Gln His Gly Asp Asp Ala Ser Glu Glu Lys Ala
195 200 205

gcc aac cag atc cga aaa tgt cag cag agc aca tct gca gtc att ggt 949
Ala Asn Gln Ile Arg Lys Cys Gln Gln Ser Thr Ser Ala Val Ile Gly
210 215 220 225

gtg cgt gtg tgt ggc atg cag gtg tac caa gca ggc agt ggg cag ctc 997
Val Arg Val Cys Gly Met Gln Val Tyr Gln Ala Gly Ser Gly Gln Leu
230 235 240

atg ttc atg aac aag tac cat gga cgg aag cta tcg gtg cag ggc ttc 1045
Met Phe Met Asn Lys Tyr His Gly Arg Lys Leu Ser Val Gln Gly Phe
245 250 255

aag gag gca ctt ttc cag ttc ttc cac aat ggg cgg tac ctg cgc cgt 1093
Lys Glu Ala Leu Phe Gln Phe Phe His Asn Gly Arg Tyr Leu Arg Arg
260 265 270

gaa ctc ctg ggc cct gtg ctc aag aag ctg act gag ctc aag gca gtg 1141
Glu Leu Leu Gly Pro Val Leu Lys Lys Leu Thr Glu Leu Lys Ala Val
275 280 285

ttg gag cga cag gag tcc tac cgc ttc tac tca agc tcc ctg ctg gtc 1189
Leu Glu Arg Gln Glu Ser Tyr Arg Phe Tyr Ser Ser Ser Leu Leu Val
290 295 300 305

att tat gat ggc aag gag cgg ccc gaa gtg gtc ctg gac tca gat gct 1237
Ile Tyr Asp Gly Lys Glu Arg Pro Glu Val Val Leu Asp Ser Asp Ala
310 315 320

gag gat ttg gag gac ctg tca gag gaa tca gct gat gag tct gct ggt 1285
Glu Asp Leu Glu Asp Leu Ser Glu Glu Ser Ala Asp Glu Ser Ala Gly
325 330 335

gcc tat gcc tac aaa ccc atc ggc gcc agc tct gta gat gtg cgc atg 1333
Ala Tyr Ala Tyr Lys Pro Ile Gly Ala Ser Ser Val Asp Val Arg Met
340 345 350

atc gac ttt gca cac acc acc tgc agg ctg tat ggc gag gac acc gtg 1381
Ile Asp Phe Ala His Thr Thr Cys Arg Leu Tyr Gly Glu Asp Thr Val
355 360 365

gtg cat gag ggc cag gat gct ggc tat atc ttc ggg ctc cag agc ctg 1429
Val His Glu Gly Gln Asp Ala Gly Tyr Ile Phe Gly Leu Gln Ser Leu
370 375 380 385

ata gac att gtc aca gag ata agt gag gag agt ggg gag tgagcttgct 1478
Ile Asp Ile Val Thr Glu Ile Ser Glu Glu Ser Gly Glu
390 395

agctgctcca gtacttgaga gcgactctgt gtcccaggca cagctgtgct gcgtcaggga 1538
ggaagccagt atggccaggt ggtggctcct gcagcctgga gctgatgtgc agtggcctct 1598
gtgagcccca gcctgagcca gtcccagctg tgcttggagt ctttatttat tttaactatt 1658
tcttcaacat tccacatttg atgatgatac ctctttcttc cctgagtgta tatgttctaa 1718
tacaaatctt tttgtttatt gtaaaaaaaa aaaaaaaaa 1757

<210> 77 
<211> 2027 
<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
<222> 48..950

<220>
<221> sig_peptide
<222> 48..107
<223> Von Heijne matrix
score 6.64507667657896
seq LLPLLSLLVGAWL/KL

<400> 77

atgcgcagcg gggccgtggg tgtacgcggc gcagcgcggc agtcctg atg gcc cgg 56
Met Ala Arg
-20

cat ggg tta ccg ctg ctg ccc ctg ctg tcg ctc ctg gtc ggc gcg tgg 104
His Gly Leu Pro Leu Leu Pro Leu Leu Ser Leu Leu Val Gly Ala Trp
-15 -10 -5

ctc aag cta gga aat gga cag gct act agc atg gtc caa ctg cag ggt 152
Leu Lys Leu Gly Asn Gly Gln Ala Thr Ser Met Val Gln Leu Gln Gly
1 5 10 15

ggg aga ttc ctg atg gga aca aat tct cca gac agc aga gat ggt gaa 200
Gly Arg Phe Leu Met Gly Thr Asn Ser Pro Asp Ser Arg Asp Gly Glu
20 25 30

ggg cct gtg cgg gag gcg aca gtg aaa ccc ttt gcc atc gac ata ttt 248
Gly Pro Val Arg Glu Ala Thr Val Lys Pro Phe Ala Ile Asp Ile Phe
35 40 45

cct gtc acc aac aaa gat ttc agg gat ttt gtc agg gag aaa aag tat 296
Pro Val Thr Asn Lys Asp Phe Arg Asp Phe Val Arg Glu Lys Lys Tyr
50 55 60

cgg aca gaa gct gag atg ttt gga ttg agc ttt gtc ttt gag gac ttt 344
Arg Thr Glu Ala Glu Met Phe Gly Leu Ser Phe Val Phe Glu Asp Phe
65 70 75

gtc tct gat gag ctg aga aac aaa gcc acc cag cca atg aag tct gta 392
Val Ser Asp Glu Leu Arg Asn Lys Ala Thr Gln Pro Met Lys Ser Val
80 85 90 95

ctc tgg tgg ctt cca gtg gaa aag gca ttt tgg agg cag cct gca ggt 440
Leu Trp Trp Leu Pro Val Glu Lys Ala Phe Trp Arg Gln Pro Ala Gly
100 105 110

cct ggc tct ggc atc cga gag aga ctg gag cac cca gtg tta cac gtg 488
Pro Gly Ser Gly Ile Arg Glu Arg Leu Glu His Pro Val Leu His Val
115 120 125

agc tgg aat gac gcc cgt gcc tac tgt gct tgg cgg gga aaa cga ctg 536
Ser Trp Asn Asp Ala Arg Ala Tyr Cys Ala Trp Arg Gly Lys Arg Leu
130 135 140

ccc acg gag gaa gag tgg gag ttt gcc gcc cga ggg ggc ttg aag ggt 584
Pro Thr Glu Glu Glu Trp Glu Phe Ala Ala Arg Gly Gly Leu Lys Gly
145 150 155

caa gtt tac cca tgg ggg aac tgg ttc cag cca aac cgc acc aac ctg 632
Gln Val Tyr Pro Trp Gly Asn Trp Phe Gln Pro Asn Arg Thr Asn Leu
160 165 170 175

tgg cag gga aag ttc ccc aag gga gac aaa gct gag gat ggc ttc cat 680
Trp Gln Gly Lys Phe Pro Lys Gly Asp Lys Ala Glu Asp Gly Phe His
180 185 190

gga gtc tcc cca gtg aat gct ttc ccc gcc cag aac aac tac ggg ctc 728
Gly Val Ser Pro Val Asn Ala Phe Pro Ala Gln Asn Asn Tyr Gly Leu
195 200 205

tat gac ctc ctg ggg aac gtg tgg gag tgg aca gca tca ccg tac cag 776
Tyr Asp Leu Leu Gly Asn Val Trp Glu Trp Thr Ala Ser Pro Tyr Gln
210 215 220

gct gct gag cag gac atg cgc gtc ctc cgg ggg gca tcc tgg atc gac 824
Ala Ala Glu Gln Asp Met Arg Val Leu Arg Gly Ala Ser Trp Ile Asp
225 230 235

aca gct gat ggc tct gcc aat cac cgg gcc cgg gtc acc acc agg atg 872
Thr Ala Asp Gly Ser Ala Asn His Arg Ala Arg Val Thr Thr Arg Met
240 245 250 255

ggc aac act cca gat tca gcc tca gac aac ctc ggt ttc cgc tgt gct 920
Gly Asn Thr Pro Asp Ser Ala Ser Asp Asn Leu Gly Phe Arg Cys Ala
260 265 270

gca gac gca ggc cgg ccg cca ggg gag ctg taagcagccg ggtggtgaca 970
Ala Asp Ala Gly Arg Pro Pro Gly Glu Leu
275 280

aggagaaaag ccttctaggg tcactgtcat tccctggcca tgttgcaaac agcgcaattc 1030
caagctcgag agcttcagcc tcaggaaaga acttcccctt ccctgtctcc catccctctg 1090
tggcaggcgc ctctcaccag ggcaggagag gactcagcct cctgtgtttt ggagaagggg 1150
cccaatgtgt gttgacgatg gctgggggcc aggtgtttct gttagaggcc aagtattatt 1210
gacacaggat tgcaaacaca caaacaattg gaacagagca ctctgaaagg ccatttttta 1270
agcattttaa aatctattct ctcccccttt ctccctggat gattcaggaa gctgacattg 1330
tttcctcaag gcagaatttt cctggttctg ttttctcagc cagttgctgt ggaaggagaa 1390
tgctttcttt gtggcctcat ctgtggtttc gtgtccctct gaaggaaact agtttccact 1450
gtgtaacagg cagacatgta actatttaaa gcacagttca gtcctaaaag ggtctgggag 1510
aaccagatga tgtactaggt gaagcattgc attgtgggaa tcacaaagca aatagtactc 1570
cagaaagaca aatatcagaa gcttcctatt cttttttttt tttttttttt tttggagaca 1630
gggtctttct ctgttgccca ggctagagtg cactggtgat cacggctcac tctagccttg 1690
aattcctggg cccaagcaat tctcccacct cagcctcctg agtagctggg actacaagtg 1750
tgcaccacca tgcctggcta attttttgaa tttttgtagt gatgggatct cgctctgttg 1810
cccagggtgg tctcgaactc ctggcctcaa gcgatcctcc cacctcgacc tcccaaagtg 1870
ctgggattac aggtgtgagc cacctcgcct gggccccctt ctccatatgc ctccaaaaac 1930
atgtccctgg agagtagcct gctcccacac tgtcactgga tgtcatgggg ccaataaaat 1990
ctcctgcaat tgtgtatctc aaaaaaaaaa aaaaaaa 2027

<210> 78
<211> 1880
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 156..512

<220>
<221> sig_peptide
<222> 156..206
<223> Von Heijne matrix
score 3.55618791452243
seq WLTAVASLLPSPG/NS



<400> 78 

atatacaggt ggcagctctc gtcccctgag agcgggcgaa ggccagggtc ccacactcgc 60
agcgctgcga cggcgctcgg gacctccctc gtccactgct tgagttccag aggtgggtgc 120
ttccctgtcc tgaacttcag agtgcggagt cataa atg ggt tcc ggc tgg ctt 173
Met Gly Ser Gly Trp Leu
-15

act gca gta gcc tcg ctc ctc ccc agc ccc ggt aac tcc gag cta ccc 221
Thr Ala Val Ala Ser Leu Leu Pro Ser Pro Gly Asn Ser Glu Leu Pro
-10 -5 1 5

gtc cag gcc ctc ggg cgt cgc ggg ggc agg gac tgg gcg cgg aac gag 269
Val Gln Ala Leu Gly Arg Arg Gly Gly Arg Asp Trp Ala Arg Asn Glu
10 15 20

gca ggg agg gac ctg gaa aaa cca ccc aga ttg cat tgc agt ggg cga 317
Ala Gly Arg Asp Leu Glu Lys Pro Pro Arg Leu His Cys Ser Gly Arg
25 30 35

ggc cgc ctg gag gag ccg gtt ccc cct aac cac ctc ccc gtg ggg ctc 365
Gly Arg Leu Glu Glu Pro Val Pro Pro Asn His Leu Pro Val Gly Leu
40 45 50

tcg gtg cgc ggt tcc cag gtg ctc agc tct gct ggg ccc agg agg tgc 413
Ser Val Arg Gly Ser Gln Val Leu Ser Ser Ala Gly Pro Arg Arg Cys
55 60 65

cgc ctc aca ggg acg cgg aac ccc gtg cgt ggc ccc cgc cgg gtg gaa 461
Arg Leu Thr Gly Thr Arg Asn Pro Val Arg Gly Pro Arg Arg Val Glu
70 75 80 85

cag ata gcg cgg ggc ggt ccg gag gct cgt cgc caa gca ggt gac tct 509
Gln Ile Ala Arg Gly Gly Pro Glu Ala Arg Arg Gln Ala Gly Asp Ser
90 95 100

tgc tgaaaaagtg gttggaacac ttaaggaaac ccggccccgc ctgttctttc 562
Cys

taggtctttg gagtttggat taatcatttg tgtagcccgt ttggataaac cgaagacttt 622 

attaaatcag cgcgtttaac aggaattccg cagtagtatc cacattagaa tcttgagtct 682 

tggagttgaa catattcaca cagacttgcc ttcttcctgt ttagtttatg ccttgtgttc 742

cgttattgga acgctaagct tgtgggagtt gtttacatcc tactgctcaa ggtcatcgct 802 

aaggtgtgat ttttcacaaa aagaatttgc aacctccggc atgaatgact taagggaagt 862 

ctaatcccgg tttctgattt tttttttttt ttaatttaaa agttaatctt tctgggccgg 922 

gcgcggtggc tcacgcctgt aatcccagca ctttgggagg ccgaggcgga tcacgaggtc 982 

aggagttcga gaccagcctg accaacatgg tgaaaccccg tctctactaa aaacacaaaa 1042

attagccggg cggggtggcg cgcacctgta atcccagctg ctcgggaggc tgaggcagga 1102 

gaatcgcttg aacctgggag gcggggggtt gcagtgagcc gagatctggc cattgcactc 1162 

cagcgtgggc aacagagtga gactccatct caaaaaaaaa ggttaatctt tccaactaga 1222 

ttttcaagga tgaggatttt gttgttgttg ttgttgttgt tctcaaatgt attcccaggg 1282 

cttggaacag agcctgacat atactaggca ctcaacaaat atttgttgaa tgattgtaat 1342

gagtaacacc catttttgca gatctttgtc ttctgagcct agggcatagg tcatcactgc 1402 

aggggtgaga ttgtcaaaat gggagtctac aggtaattta agacttaaat gtttaaagag 1462 

tatgtgctca ttcttcaaca aacttacttt tgttaaatta aaatggtaaa atgtggtgga 1522 

ggggttggaa tatatgtaat tcaagacagt tctgaataca aaaatgtttt actgtctatc 1582 

accaccatct ataaatctaa ttcactaagg ataatctgtg taaggtggct ggaaagaacc 1642

ttgaggagag aggcttattt aagtattggc tcaggaccac acctaaaatt ctcaaaacgt 1702 

tgagattctg ttgttttgtt tttaagcgcc agagacccaa gttgaggaac agcctataaa 1762 

ataactggcc tgtactctta catacatgaa agccatcaaa gacaaagact gaagaagaac 1822 

ttttgcagat taaaggactt taagagacat gatcctgaac caaaaaaaaa aaaaaaaa 1880 


<210> 79 
<211> 584 
<212> DNA 

<213> Homo sapiens 

<220>
<221> CDS
<222> 67..351

<220>
<221> sig_peptide
<222> 67..183
<223> Von Heijne matrix
score 10.6473524146908
seq FLCALCSFCPISA/AS

<400> 79 

ctgattcttc gaaatgatat aagtcctgag ggcttcagtc ccattcgcgc actcatactt 60 
gcaatc atg gac tac agc cgt gtc ttt cag ggt gtg ttc ttc acc ttc 108
Met Asp Tyr Ser Arg Val Phe Gln Gly Val Phe Phe Thr Phe
-35 -30

aag cat gct ttt gct gat ggt gct tgg gat ctt tca ttt ctc tgt gct 156
Lys His Ala Phe Ala Asp Gly Ala Trp Asp Leu Ser Phe Leu Cys Ala
-25 -20 -15 -10

ctt tgc agt ttc tgc cca atc tca gct gcc tct ggc aga cct tac agg 204
Leu Cys Ser Phe Cys Pro Ile Ser Ala Ala Ser Gly Arg Pro Tyr Arg
-5 1 5

tac ttg gaa ttc tgg aga tta tac ctg tct cct agt tcc atg gaa aat 252
Tyr Leu Glu Phe Trp Arg Leu Tyr Leu Ser Pro Ser Ser Met Glu Asn
10 15 20

gga gtt caa aaa ttc cac gaa act ttt ttc att gtc ttt ttg ctt ttg 300
Gly Val Gln Lys Phe His Glu Thr Phe Phe Ile Val Phe Leu Leu Leu
25 30 35

ttt gat atc gag agg aaa gga aaa agt tct gtt tgt cca ttt tgt tac 348
Phe Asp Ile Glu Arg Lys Gly Lys Ser Ser Val Cys Pro Phe Cys Tyr
40 45 50 55

aga taaggaaagt ggtttcacaa aggttaagca acttgttcag tgttacccag 401
Arg

caaagagcag aatgattttc aacattcagt ttaaaagtcg gcggggggca gtggctcaca 461
cctgtaatat cagcaacttg ggaggccaag gtggtacggt cgcttgaagc caaggagttc 521
aagaccagcc tggtcaacat agcaaaacct tgtctttaca aaaagtaaaa aaaaaaaaaa 581
aaa 584

<210> 80 

<211> 1351 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
<222> 259..831

<220>
<221> sig_peptide
<222> 259..375
<223> Von Heijne matrix
score 5.809301698725
seq FCVCVIAIGVVQA/LI

<400> 80 

aagctcccgg ccgggctgac tcaagcggag gcgcgcggaa cagtcgccga ggcgattccc 60 
gcccagcagt tcgacagaag tgtacagagg cttctggcaa cacggattgc cgtctacctg 120 
atgacctttc tcatcgtgac agtggcctgg gcagcacaca caaggttgtt ccaagttgtt 180
gggaaaacag acgacacact tgccctgctc aacctggccg catcatggct gtgatgccct 240
ccctccctcc aggcctgc atg atg acc atc acc ttc ctg cct tac acg ttt 291
Met Met Thr Ile Thr Phe Leu Pro Tyr Thr Phe
-35 -30

tcg tta atg gtg acc ttc cct gat gtg cct ctg ggc atc ttc ttg ttc 339
Ser Leu Met Val Thr Phe Pro Asp Val Pro Leu Gly Ile Phe Leu Phe
-25 -20 -15

tgt gtg tgt gtg atc gcc atc ggg gtc gtg cag gca ctg att gtg ggg 387
Cys Val Cys Val Ile Ala Ile Gly Val Val Gln Ala Leu Ile Val Gly
-10 -5 1

tac gca ttc cac ttc ccg cac ctg ctg agc ccg cag atc cag cgc tct 435
Tyr Ala Phe His Phe Pro His Leu Leu Ser Pro Gln Ile Gln Arg Ser
5 10 15 20

gcc cac agg gct ctg tac cga cga cac gtc ctg ggc atc gtc ctc caa 483
Ala His Arg Ala Leu Tyr Arg Arg His Val Leu Gly Ile Val Leu Gln
25 30 35

ggc ccg gcc ctg tgc ttt gca gcg gcc atc ttc tct ctc ttc ttt gtc 531
Gly Pro Ala Leu Cys Phe Ala Ala Ala Ile Phe Ser Leu Phe Phe Val
40 45 50

ccc ttg tct tac ctg ctg atg gtg act gtc atc ctc ctc ccc tat gtc 579
Pro Leu Ser Tyr Leu Leu Met Val Thr Val Ile Leu Leu Pro Tyr Val
55 60 65

agc aag gtc acc ggc tgg tgc aga gac agg ctc ctg ggc cac agg gag 627
Ser Lys Val Thr Gly Trp Cys Arg Asp Arg Leu Leu Gly His Arg Glu
70 75 80

ccc tcg gct cac cca gtg gaa gtc ttc tcg ttt gac ctc cac gag cca 675
Pro Ser Ala His Pro Val Glu Val Phe Ser Phe Asp Leu His Glu Pro
85 90 95 100

ctc agc aag gag cgc gtg gaa gcc ttc agc gac gga gtc tac gcc atc 723
Leu Ser Lys Glu Arg Val Glu Ala Phe Ser Asp Gly Val Tyr Ala Ile
105 110 115

gtg gcc acg ctt ctc atc ctg gac atc tgc ccc tcc tgc tcc ctt tgg 771
Val Ala Thr Leu Leu Ile Leu Asp Ile Cys Pro Ser Cys Ser Leu Trp
120 125 130

ctg gct gtt gct tcc ttc cag cgt ctg ctc ctc cgc ggc ctc atc tgc 819
Leu Ala Val Ala Ser Phe Gln Arg Leu Leu Leu Arg Gly Leu Ile Cys
135 140 145

ctc ttc gtc tgt tagagcgcgc gtctcgtctc agtcgtcacg tttttggttt 871
Leu Phe Val Cys
150

ttgtggggtt tttttttttt tttttttttg agacagtcct gctgtgtcgc ccaggctgga 931
gtatagtggc tcaagctcag ctcactgcaa cctccgcctc ccaggttcaa gcaattctcc 991
tgcctcagcc tcccaagtag ttgggattac aagcacccac caccatgccc agctaacttt 1051
ttgcattttt aatagagatg aggtttcacc aagttggcca ggctggtctt gaactcctga 1111
cctcaggtga tctgcccacc tcggcctccc aaagtgctgg gattacaggt gtaagccacc 1171
gtgcccggcc atcgtaatgt ttgaatttgc ttttttacat cttccatcct tttggagtgt 1231
cttgttccct cgtcatagtt cagcactgtg accaccttgg ggttagacac tatggtttta 1291
tatcctgtac ttgatattct cgagtccaag tctcctgatg ctctcaaaaa aaaaaaaaaa 1351



<210> 81
<211> 720
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 111..377

<220>
<221> sig_peptide
<222> 111..233
<223> Von Heijne matrix
score 5.26415334394122
seq LWFLAQIPSRVAG/SL

<400> 81 

aaaccgaaac cagcgctcca aacaattggg acccgggatc ttatgccagt gaggctgtgc 60 
tgcggctgag cgggcctccc atccctctta aaagagttag gcatttagcc atg cct 116
Met Pro
-40

ccc acc cgg gac cct ttc cag cag cct aca tta gat aac gat gat tcc 164
Pro Thr Arg Asp Pro Phe Gln Gln Pro Thr Leu Asp Asn Asp Asp Ser
-35 -30 -25

tac tta gga gaa ctg cgg gct tcc aag gta ctg tgg ttt ctt gcg cag 212
Tyr Leu Gly Glu Leu Arg Ala Ser Lys Val Leu Trp Phe Leu Ala Gln
-20 -15 -10

att ccc agt agg gtc gcc ggt agt ctt ctt tct gtc tgt gtg atg agc 260
Ile Pro Ser Arg Val Ala Gly Ser Leu Leu Ser Val Cys Val Met Ser
-5 1 5

aga gat ggt aac ata aag gac tct ggt gaa gac act cag tcg ggt acc 308
Arg Asp Gly Asn Ile Lys Asp Ser Gly Glu Asp Thr Gln Ser Gly Thr
10 15 20 25

agg gaa gtc tgt ttt ctg cct gcc tcc cta tct cca tat tca agt cgg 356
Arg Glu Val Cys Phe Leu Pro Ala Ser Leu Ser Pro Tyr Ser Ser Arg
30 35 40

cta acg ttt cag agg cgt ttt tgagcagagg aaagtagagt tctagtctag 407
Leu Thr Phe Gln Arg Arg Phe
45

aggaacaagg ggctctggca gctcaaatca attaaccaag atccaattcc ctggagaatt 467
tttaacccct cccactccac ccatcacttg cctggctaac atcagacact ggatcaaccc 527
taaaaaggag tccatccaca gcatccaagg atccatagtg tcccctcaca ctgcagccac 587
caatggaggc tactcccgaa agaaagatgg tggcttcttc tccacctagt gttgacagat 647
ccctgaacta attatagtga aacatactgc ggcccacttc cattaaatag atttgtgcaa 707
aaaaaaaaaa aaa 720

<210> 82 

<211> 1029 

<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 223..432

<220>
<221> sig_peptide
<222> 223..336
<223> Von Heijne matrix
score 4.17665217008018
seq LVNVLFFFTPLMT/LV



<400> 82 

gtttttgtat tggaagcagt tgtttggcct tgctgagcaa acgtctatgc cttctccatt 60
acatccaaag gagaatagcc ccatgtgaag aatggaatca gtagatgttt ggtcgctgta 120
ccatatccac tcctaggata caacaagagc aagcccaatt ctcttggtgg tgtgggcagt 180
cggcttgcac cacgtaccta tctcagctct ttttggaagc tt atg tcc tcc cca 234
Met Ser Ser Pro
-35

caa ctt cca gct ttc tta tgg gac aag ggt aca ctc acc act gcc ata 282
Gln Leu Pro Ala Phe Leu Trp Asp Lys Gly Thr Leu Thr Thr Ala Ile
-30 -25 -20

tct aat cct gct tgc ctg gta aat gtt ctc ttc ttc ttt aca ccc ctg 330
Ser Asn Pro Ala Cys Leu Val Asn Val Leu Phe Phe Phe Thr Pro Leu
-15 -10 -5

atg act ctg gtc act cta ctc atc ctg gtc tgg aaa gta acc aaa gac 378
Met Thr Leu Val Thr Leu Leu Ile Leu Val Trp Lys Val Thr Lys Asp
1 5 10

aaa agc aac aag aac aga gag aca cac cca aga aag gag gca aca tgg 426
Lys Ser Asn Lys Asn Arg Glu Thr His Pro Arg Lys Glu Ala Thr Trp
15 20 25 30

ctg cca taaagatctg gatctcttgg tggggactcc actgaggtga agacctgatt 482
Leu Pro

gtacaagaga ggcacggcca ctggagctgt ctcagagccc agagccaggg gagccagagc 542
tgctttagcc accctgttcc tccattgcca gatgtccccc caggcctcat ttccttcctc 602
tgccaccatc cctcttataa tgcactcctc ctgcggttct ttggcttgtc ccagcttctg 662
agtttgaatg tctttttttt tttttttttt tttttgkgga tcttcaagac tgaaatagta 722
aatggctctt gatttctgca ctaacagagg aaagaaacaa gtacatggaa aagtaaaaat 782
tgattacaaa gcctaaattt tcctctataa attgggcatg tgctgactgt gggatattga 842
aattattggg agctcacagc atctcaagtt atataatgaa gctattctgg aagctcattt 902
ccagaagatc cttaaaatga aatggctcac tctctgctga attaatttgg agcaagttaa 962
ctcctttttc aaatgaaatc caaattaaag aggcagtttt ttttgaaaaa ccaaaaaaaa 1022
aaaaaaa 1029



<210> 83 

<211> 1788 

<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 769..1272

<220>
<221> sig_peptide
<222> 769..843
<223> Von Heijne matrix
score 5.65786415517206
seq AAHLLVVILPANA/AL

<400> 83 

ttgggcagaa aaattcaagc aggagattgt atttcttttg gagttgtacg atttccttta 60
ttatttgaac tgcagtaaag aaagctggga tgggctcctc tagggatact tccagatccc 120
tgggcggttg tagccctggc tcctctttaa atggatttgg tttcaaagac gatcatctcc 180
gtcttctcgg atgtcatagt gccactgatc atctccagct cctggccacc ctgggctttc 240
tccactttgg cctctatgtt ttgcttctcc accgtcttag ccacgatatc tacctctctg 300
tcatgtgatg tgacccttgt ttttgaacca ggagtggccc tgaggctcct taaaaaagag 360
ctgatcttac tggctttctt ttgtaaagct cctgtgctag atgcagattg gttcttcccc 420
agttcctgag ttgttctcga ctcctttttg gtggagctgt gggtggagct cttgcgagag 480
gagccatgtc gcttgcccct tacgttgctg tacccttttt cttttttgtc atctctcgtg 540
tttttatggc cagatgcgga ccggtgggaa gacgctttct gattcttgtc ccccgctctc 600
ctgcggtgac tttcacctgc cttgcggtga tgagaacttt tcctactggg atgtctgtcc 660
ttcttttctc ttctttcctt gttttcattc cagacttcag cactgggctg ggaaactttc 720
tggcttccat ctcgttcact catgtagcct tcgctttgca aggtggag atg agg ggt 777
Met Arg Gly
-25

ccc act gct ggt cct tca gtt ctt tct gct gca cac ttg ctg gtc gta 825
Pro Thr Ala Gly Pro Ser Val Leu Ser Ala Ala His Leu Leu Val Val
-20 -15 -10

ata ctg cct gca aac gcc gca ctc aag ctg ctg tct tgg gag aga ctg 873
Ile Leu Pro Ala Asn Ala Ala Leu Lys Leu Leu Ser Trp Glu Arg Leu
-5 1 5 10

gcg gcc ccc gcc atc gag gtg gaa gta cct tcc aag gag gtg ctt gca 921
Ala Ala Pro Ala Ile Glu Val Glu Val Pro Ser Lys Glu Val Leu Ala
15 20 25

gca ccc acc aag gcc aag cta ata ccc tct gag gat atg ttg gca gca 969
Ala Pro Thr Lys Ala Lys Leu Ile Pro Ser Glu Asp Met Leu Ala Ala
30 35 40

cct gcc atg gac ttg ctg gat tca ttt tct cct gga ttt ttg ata gct 1017
Pro Ala Met Asp Leu Leu Asp Ser Phe Ser Pro Gly Phe Leu Ile Ala
45 50 55

gct ccc gcc agc gct gtg atc act tgg cct ggg cct gca gat ttg gtt 1065
Ala Pro Ala Ser Ala Val Ile Thr Trp Pro Gly Pro Ala Asp Leu Val
60 65 70

gtt gct atg ctc ata gca cct gtt gca gga ctc att gct gcc cct gct 1113
Val Ala Met Leu Ile Ala Pro Val Ala Gly Leu Ile Ala Ala Pro Ala
75 80 85 90

att gcc aca tct gtt cta ggt cct gtt gct gtt cct gcc act gcc atg 1161
Ile Ala Thr Ser Val Leu Gly Pro Val Ala Val Pro Ala Thr Ala Met
95 100 105

cca cct gct gtc ctt gct gct cct cct tca gca gcc cct gga gtg ctc 1209
Pro Pro Ala Val Leu Ala Ala Pro Pro Ser Ala Ala Pro Gly Val Leu
110 115 120

gtg gat gga gaa gcc gca cta gcc gtt ccg tgg gag gca tgt tgg att 1257
Val Asp Gly Glu Ala Ala Leu Ala Val Pro Trp Glu Ala Cys Trp Ile
125 130 135

ccc tct ccc cca gca taagcagaag aggtggctgc agatacatca caaggcttgt 1312
Pro Ser Pro Pro Ala
140

agagcccagt ctcactctga tccccttctc tgtggagctc tgcagcctat accaagggga 1372
agagaaacag atgagattga gatgactgaa agggagatca gaactttcta ctcctctctt 1432
atcctggagt taattcaagg gcttataatt agaagaacct gggtcgggtg tggtggctca 1492
cgcctgtaat cccaacactt tgggaggcca aggagggcag atcgcttgag gccaggagtt 1552
caagaccagc cttgccaaca tagcaaaacc ccgactctac taaaaataca aaaaattagc 1612
tggacaggat ggcgcatgcc tgtaatccca gctactcagt aggctgaggt aggagtatcg 1672
cttgaactcg gatggcggag gctgcagtga gccaagactg cgccactcca ctgcactcca 1732
gcctgggcaa cagagtgaga cactgtttaa aaaaaagaaa gaaaaaaaaa aaaaaa 1788



<210> 84 

<211> 805
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 30..527

<220>
<221> sig_peptide
<222> 30..74
<223> Von Heijne matrix
score 8.68924532952647
seq PLLIICLLPAIEG/KN

<400> 84 

actggggcac agtaggagga acccagaag atg ctg cct ctc ctg atc atc tgt 53
Met Leu Pro Leu Leu Ile Ile Cys
-15 -10

ctc ctg cct gcc att gaa ggg aag aac tgc ctc cgc tgc tgg cca gaa 101
Leu Leu Pro Ala Ile Glu Gly Lys Asn Cys Leu Arg Cys Trp Pro Glu
-5 1 5

ctg tct gcc ttg ata gac tat gac ctg cag atc ctc tgg gtg acc cca 149
Leu Ser Ala Leu Ile Asp Tyr Asp Leu Gln Ile Leu Trp Val Thr Pro
10 15 20 25

ggg cca ccc aca gaa ctt tct caa aat cgt gac cat ttg gaa gaa gaa 197
Gly Pro Pro Thr Glu Leu Ser Gln Asn Arg Asp His Leu Glu Glu Glu
30 35 40

aca gcc aaa ttc ttc act caa gta cac caa gcc att aaa acg tta cga 245
Thr Ala Lys Phe Phe Thr Gln Val His Gln Ala Ile Lys Thr Leu Arg
45 50 55

gat gat aaa aca gta ctt ctg gaa gag atc tac acg cac aag aat ctc 293
Asp Asp Lys Thr Val Leu Leu Glu Glu Ile Tyr Thr His Lys Asn Leu
60 65 70

ttt act gag agg ctg aat aag ata tct gat ggg ctg aag gag aag gac 341
Phe Thr Glu Arg Leu Asn Lys Ile Ser Asp Gly Leu Lys Glu Lys Asp
75 80 85

ata cag tcc aca ctg aag gtc acc agc tgt gct gac tgc agg act cac 389
Ile Gln Ser Thr Leu Lys Val Thr Ser Cys Ala Asp Cys Arg Thr His
90 95 100 105

ttc ctc tcc tgc aat gac ccc act ttc tgc cca gcc agg aac cgg cgg 437
Phe Leu Ser Cys Asn Asp Pro Thr Phe Cys Pro Ala Arg Asn Arg Arg
110 115 120

acc tcc ctg tgg gct gtg agt ctc agc agt gct cta ctc ctg gcc ata 485
Thr Ser Leu Trp Ala Val Ser Leu Ser Ser Ala Leu Leu Leu Ala Ile
125 130 135

gct gga gat gtt tct ttt act ggc aaa gga aga agg agg cag 527
Ala Gly Asp Val Ser Phe Thr Gly Lys Gly Arg Arg Arg Gln
140 145 150

taaagcagga acagggcagc ccgcatgtct tccagaagtg aacagaggcc gcagctacca 587
ccgtcacaaa gttcactcat ctctgggtcc cggtgacccc atccccccat accctccatc 647
ctgggtcctg gggccccaaa gctctgaggc ctaggagact gcgctgtctc gtggtttgcc 707
tactcctaca cctttgtaaa gagtctcttc attaaaaccc ctcttcataa aaaaaaaaaa 767
aaaaaaaaaa aaaaaaaaaa aataaaaaaa aaaaaaaa 805

<210> 85 
<211> 814 
<212> DNA 
<213> Homo sapiens

<220>
<221> CDS
<222> 39..506

<220>
<221> sig_peptide
<222> 39..83
<223> Von Heijne matrix
score 5.91494342964539
seq ILMLTFIICGLLT/RV

<400> 85 

attcctcagg acacagagct tcctctctcc caggagcc atg aat atc ctg atg ctg 56
Met Asn Ile Leu Met Leu
-15 -10

acc ttc att atc tgt ggg ttg cta act cgg gtg acc aaa ggt agc ttt 104
Thr Phe Ile Ile Cys Gly Leu Leu Thr Arg Val Thr Lys Gly Ser Phe
-5 1 5

gaa ccc caa aaa tgt tgg aag aat aat gta gga cat tgc aga aga cga 152
Glu Pro Gln Lys Cys Trp Lys Asn Asn Val Gly His Cys Arg Arg Arg
10 15 20

tgt tta gat act gaa agg tac ata ctt ctt tgt agg aac aag cta tca 200
Cys Leu Asp Thr Glu Arg Tyr Ile Leu Leu Cys Arg Asn Lys Leu Ser
25 30 35

tgc tgc att tct ata ata tca cat gaa tat act cga cga cca gca ttt 248
Cys Cys Ile Ser Ile Ile Ser His Glu Tyr Thr Arg Arg Pro Ala Phe
40 45 50 55

cct gtg att cac cta gag gat ata aca ttg gat tat agt gat gtg gac 296
Pro Val Ile His Leu Glu Asp Ile Thr Leu Asp Tyr Ser Asp Val Asp
60 65 70

tct ttt act ggt tcc cca gta tct atg ttg aat gat ctg ata aca ttt 344
Ser Phe Thr Gly Ser Pro Val Ser Met Leu Asn Asp Leu Ile Thr Phe
75 80 85

gac aca act aaa ttt gga gaa acc atg aca cct gag acc aat act cct 392
Asp Thr Thr Lys Phe Gly Glu Thr Met Thr Pro Glu Thr Asn Thr Pro
90 95 100

gag act act atg cca cca tcc gag gcc act act ccc gag act act atg 440
Glu Thr Thr Met Pro Pro Ser Glu Ala Thr Thr Pro Glu Thr Thr Met
105 110 115

cca cca tct gag act gct act tcc gag act atg cca cca cct tct cag 488
Pro Pro Ser Glu Thr Ala Thr Ser Glu Thr Met Pro Pro Pro Ser Gln
120 125 130 135

aca gct ctt act cat aat taattaacat ttacttctgg tatggaacaa 536
Thr Ala Leu Thr His Asn
140

ctagaaatac tgctggaaat aatatccaaa gagctgattc taccaatcca atttcaccag 596
gaaaattcca tcagggattg gatgaccatg gggatggaca taattgctac taccaacaca 656
acagccaaga gagttgcctt acaattagaa atgtgtagac agaaatgtat agaagataca 716
aggattctct taattggact taaattcttt atctgtcttc ctccgatgta ctcaaatata 776
tgagctaatt tttgtcttaa gtgaaaaaaa aaaaaaaa 814



<210> 86
<211> 598
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 115..429

<220>
<221> sig_peptide
<222> 115..210
<223> Von Heijne matrix
score 8.2583062681354
seq LVAAMVLLSVVFC/LY

<400> 86 

attctaccag ctctggctga gcctgagctt ccaaaagtga gctgagctgt tcaaccttgg 60
atcttaatta ctcctagcag ggataattag gtccctcttt ctcagattac aggc atg 117
Met

gca aag atg ttt gat ctc agg acg aag atc atg atc ggc atc gaa agc 165
Ala Lys Met Phe Asp Leu Arg Thr Lys Ile Met Ile Gly Ile Glu Ser
-30 -25 -20

agc tta ctg gtt gcc gcg atg gtg ctc cta agt gtt gtg ttc tgt ctt 213
Ser Leu Leu Val Ala Ala Met Val Leu Leu Ser Val Val Phe Cys Leu
-15 -10 -5 1

tac ttc aaa gta gct aag gca cta aaa gct gca aag gac cct gat gct 261
Tyr Phe Lys Val Ala Lys Ala Leu Lys Ala Ala Lys Asp Pro Asp Ala
5 10 15

gtg gct gta aaa aat cac aac cca gac aag gtg tgt tgg gcc acg aac 309
Val Ala Val Lys Asn His Asn Pro Asp Lys Val Cys Trp Ala Thr Asn
20 25 30

agc cag gcc aaa gcc acc acc atg gag tct tgt cca tct ctc cag tgc 357
Ser Gln Ala Lys Ala Thr Thr Met Glu Ser Cys Pro Ser Leu Gln Cys
35 40 45

tgt gaa ggt tgt aga atg cat gcc agt tct gat tcc ctg cca cct tgc 405
Cys Glu Gly Cys Arg Met His Ala Ser Ser Asp Ser Leu Pro Pro Cys
50 55 60 65

tgt tgt gac ata aat gag ggc ctc tgacttggga aagctgggca caaaaatctt 459
Cys Cys Asp Ile Asn Glu Gly Leu
70

catgagcaat atttctttct taatagaatg ttttattatt caagtcaagt tctagagtgt 519
ttacatacta ttatataatg tacagtgtta ttttctgtac ttctgaataa atgtgcaata 579
ttgcaaaaaa aaaaaaaaa 598

<210> 87 

<211> 699 

<212> DNA 

<213> Homo sapiens

<220>
<221> CDS
<222> 332..574

<220>
<221> sig_peptide
<222> 332..412
<223> Von Heijne matrix
score 7.96491294552426
seq ILGLFCCLPLAIP/AV

<400> 87 

aatcccctgt ggttggtgat caaggaagag catagtgcca gacctaggtg ccctcctggg 60
aatgttccag gagggcagga gtaggaggag gagtgttaga gtagagggga aatgatgaga 120
gcagaaagga gagtctcgct ctgtcaccca ggctggagtg cagtggcagg atcttggctc 180
acttcaacct ccacctcccg agttctgcct cagcctccca agtagctggg attacaggtc 240
cagtcactcc acgcttgcag agtccaatta acaagagcaa gttctggtag aaagaaggtg 300
actttattcc agagctcagg tgtttgaact g atg tct gat gag gat gaa tcc 352
Met Ser Asp Glu Asp Glu Ser
-25

agc gac tac ctc tgc ctg tcc atc ctg ggc ctc ttc tgt tgc ctt ccc 400
Ser Asp Tyr Leu Cys Leu Ser Ile Leu Gly Leu Phe Cys Cys Leu Pro
-20 -15 -10 -5

cta gcc atc cca gcc gtg atc ttt tct tgc ctg aca aag aac tac aat 448
Leu Ala Ile Pro Ala Val Ile Phe Ser Cys Leu Thr Lys Asn Tyr Asn
1 5 10

aaa tcc agt gac tat gag ctg gca gcc aag acc tcc aaa caa gcc tac 496
Lys Ser Ser Asp Tyr Glu Leu Ala Ala Lys Thr Ser Lys Gln Ala Tyr
15 20 25

tac tgg gcc atc gcg agc atc act gtg gga atc tta ggt acc atc ttg 544
Tyr Trp Ala Ile Ala Ser Ile Thr Val Gly Ile Leu Gly Thr Ile Leu
30 35 40

tac acc tac ctg ata tac tta ctt aga ttg taaactgctt cccagctctt 594
Tyr Thr Tyr Leu Ile Tyr Leu Leu Arg Leu
45 50

gaacaaacca ccaaatatac accacagtgc aatttaaaaa aaaaaaaaaa aaaaaaaaaa 654
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa gaaaaaaaaa aaaaa 699

<210> 88

<211> 905 

<212> DNA 

<213> Homo sapiens

<220>
<221> CDS
<222> 133..417

<220>
<221> sig_peptide
<222> 133..213
<223> Von Heijne matrix
score 11.106948594338
seq LTSLLILVTLISA/FV

<400> 88

atttccaggg agctgaggag ctgagggcag agctagcttt tggttatttg ggatgttatt 60
gccagtttcc tcccagggcc attgttacca cctgatcatt tgagttttag tttctctagc 120
agatgctgac ta atg act gac cag gat cga atc atc aat tta gtt gtt ggc 171
Met Thr Asp Gln Asp Arg Ile Ile Asn Leu Val Val Gly
-25 -20 -15

agc tta aca tcc tta ttg att cta gta acg ctg ata agt gct ttt gtt 219
Ser Leu Thr Ser Leu Leu Ile Leu Val Thr Leu Ile Ser Ala Phe Val
-10 -5 1

ttc cct caa cta cct cca aaa ccg ttg aat ata ttc ttt gct gtc tgc 267
Phe Pro Gln Leu Pro Pro Lys Pro Leu Asn Ile Phe Phe Ala Val Cys
5 10 15

atc tct ttg agt agt att act gcc tgc ata atc tac tgg tat cga caa 315
Ile Ser Leu Ser Ser Ile Thr Ala Cys Ile Ile Tyr Trp Tyr Arg Gln
20 25 30

gga gac tta gaa ccg aaa ttt aga aag cta att tac tat atc ata ttt 363
Gly Asp Leu Glu Pro Lys Phe Arg Lys Leu Ile Tyr Tyr Ile Ile Phe
35 40 45 50

tct atc atc atg ttg tgt ata tgt gca aac ctg tac ttc cat gat gtg 411
Ser Ile Ile Met Leu Cys Ile Cys Ala Asn Leu Tyr Phe His Asp Val
55 60 65

gga agg tgaggctgcc aaggagaagt acttaccagg actcttcaaa atgatacatt 467
Gly Arg

aggacagtga gtaatttttg gataaggtat gctgaagaat ctcctgcaga agtctgatac 527
atgattttca tgttaattgt aaatgttaat tccctcttgc aagggagaca tatcctagat 587
cactttgctt tttctttaag gagctgatgt tgcacctaaa cattccaacc cttaaagcta 647
aaacagcaca aaaaaatttc acttttgaaa tgaaattttt ataattgtat ggcaaaaggc 707
tatgtaaaaa caaatcttgc atcttaagac aaatattctt ttatttctgt taaactgaat 767
atacaattgt tccctaggca accaactttt gcttataact acaatttaat ttcacgttga 827
caaaacacag tgaaaagaca actttgtgaa gatctaatta caataataaa taaaataatt 887
tacaaaaaaa aaaaaaaa 905



<210> 89
<211> 514
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 113..364

<220>
<221> sig_peptide
<222> 113..172
<223> Von Heijne matrix
score 4.37180298395146
seq SLLLSLPPHQGLT/FS

<400> 89

ttttttacat ggtgttccca cagctgggag gacacccaca tggtcggcgt gcaggatatt 60
tcgctggacc ctagaaaagc caccacgacc tgtgggccat gatgctaccc ca atg gct 118
Met Ala
-20

gct gct gct gtt cct tct ctt ctt ctt tct ctt cct cct cac cag ggg 166
Ala Ala Ala Val Pro Ser Leu Leu Leu Ser Leu Pro Pro His Gln Gly
-15 -10 -5

ctc act ttc tcc aac aaa ata caa cct ttt gga gct caa gga gtc ttg 214
Leu Thr Phe Ser Asn Lys Ile Gln Pro Phe Gly Ala Gln Gly Val Leu
1 5 10

cat ccg gaa cca gga ctg cga gac tgg ctg ctg cca acg tgc tcc aga 262
His Pro Glu Pro Gly Leu Arg Asp Trp Leu Leu Pro Thr Cys Ser Arg
15 20 25 30

caa ttg cga gtc gca ctg ccg gag aag ggg tcc gag ggc agt ctg tgt 310
Gln Leu Arg Val Ala Leu Pro Glu Lys Gly Ser Glu Gly Ser Leu Cys
35 40 45

caa acg cag ctg cca gct act cca tgc ttc ctg cct tcg aat acg gtg 358
Gln Thr Gln Leu Pro Ala Thr Pro Cys Phe Leu Pro Ser Asn Thr Val
50 55 60

aga acg tgaagtcatg agctgctgct aaggcatgtg gcaaccttga agagaaggtc 414
Arg Thr

aagagctacc agccaccaaa agaatgccag cacttcctgt gtctttgctt tggattcatg 474
agaaatatac gttcctattt gcttcaaaaa aaaaaaaaaa 514

<210> 90 

<211> 518 

<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 9..380

<220>
<221> sig_peptide
<222> 9..104
<223> Von Heijne matrix
score 4.73369226787171
seq AVFAVLFVFFLFA/ML

<400> 90 

acatccta atg gtg gtg gtt gaa cca gga gcc agt tta ttc cca aat ggt 50
Met Val Val Val Glu Pro Gly Ala Ser Leu Phe Pro Asn Gly
-30 -25 -20

gtt cct tgg ctc tat gct gtg ttt gct gtg ctt ttt gta ttt ttt ctt 98
Val Pro Trp Leu Tyr Ala Val Phe Ala Val Leu Phe Val Phe Phe Leu
-15 -10 -5

ttt gcc atg tta tct ccc ttt tta ctt gag ata gac cag cac ata aag 146
Phe Ala Met Leu Ser Pro Phe Leu Leu Glu Ile Asp Gln His Ile Lys
1 5 10

aaa ttc ttg atc aga tgc agg tat tct ctg cat aac act gtg cat aag 194
Lys Phe Leu Ile Arg Cys Arg Tyr Ser Leu His Asn Thr Val His Lys
15 20 25 30

gac aaa aaa aac agt gag ata aag atg gac cat cta gaa agg cca ggc 242
Asp Lys Lys Asn Ser Glu Ile Lys Met Asp His Leu Glu Arg Pro Gly
35 40 45

tgt cca ctg gag tca cca agg aga gga gtt ctg gga ggg aag aaa aat 290
Cys Pro Leu Glu Ser Pro Arg Arg Gly Val Leu Gly Gly Lys Lys Asn
50 55 60

ggg atg gga aac gac cca tta cta ttt gtg aaa gtg aca aaa gaa ccc 338
Gly Met Gly Asn Asp Pro Leu Leu Phe Val Lys Val Thr Lys Glu Pro
65 70 75

agg gat tct gag gct gaa atc tat acc cct ggg cct tca gtt 380
Arg Asp Ser Glu Ala Glu Ile Tyr Thr Pro Gly Pro Ser Val
80 85 90

tgagagtcat ttagcctata tggaattacc tgtgacatta cattccagag agatgagaaa 440
ttctgagacc cttattatcg atgtttatat tgaaaaaatg gtaataaata ttttgagact 500
cccaaaaaaa aaaaaaaa 518

<210> 91
<211> 808
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 155..340

<220>
<221> sig_peptide
<222> 155..292
<223> Von Heijne matrix
score 8.64329745298384
seq AVLLLILFAIVFG/LL

<400> 91 

cttttcctct caacagttgc ttctttgagt cagggtgcag ctctggtcac ctggcggcct 60
cttcagctca gccctccaca aagtgtgagc ctgaaggacc accctgaatt gcccttgtag 120
gacccagaac agctaccagc agaatcagat tctc atg gac caa ctg gta ttc aaa 175
Met Asp Gln Leu Val Phe Lys
-45 -40

gag aca atc tgg aat gat gcg ttc tgg cag aac ccc tgg gac cag ggg 223
Glu Thr Ile Trp Asn Asp Ala Phe Trp Gln Asn Pro Trp Asp Gln Gly
-35 -30 -25

ggc ctg gca gtg att atc tta ttc atc acc gct gtc ctg ctt ctc atc 271
Gly Leu Ala Val Ile Ile Leu Phe Ile Thr Ala Val Leu Leu Leu Ile
-20 -15 -10

tta ttt gcc atc gtg ttt ggt tta ctc act tcc aca gaa aac act cag 319
Leu Phe Ala Ile Val Phe Gly Leu Leu Thr Ser Thr Glu Asn Thr Gln
-5 1 5

tgt gaa gcg ggt gaa gag gag tgacctgact tgctggggac tgagatggca 370
Cys Glu Ala Gly Glu Glu Glu
10 15

gcaggggagg cgagctgacc tgcccccatt ccagtggtgg gccccttcgc ggttccctct 430
ggctcagggg ccaagccctg gtgtcttcct ttcccaccag gaaaaagtct agtaaaatac 490
tgtatctggc ttagggttgg tcagactagt aagatgggga ggctggtctg agaccaattc 550
tggctccttg accctattgt ttttagggtt ccccgaccag aaccctaaaa gcacatggag 610
aggatggctc cactgcctca ggtggaagga gctatggcta acaaggttct ctaacaggct 670
cacaggccca gccagcaatt tcacaaatcc ttgacagaga aagacacaac caaatgaaat 730
aaaaattcct tttcaaatct gctaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 790
aaaaaaaaaa aaaaaaaa 808

<210> 92 
<211> 737 
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 185..634

<220>
<221> sig_peptide
<222> 185..253
<223> Von Heijne matrix
score 9.49395175807817
seq SLLFICFFGESFC/IC

<400> 92 

atattttgct gactggcaag gttatatgaa gtgcttttat tgaagcacca ttttaactaa 60
tagctcctgg tattttctgc ttcccttcgt agggaattta gttattttat tttattattt 120
agctaattta gctattttaa aatagctaaa ttttagctac ttttttttca attgacaaag 180
aagg atg tct aat caa aga cta ccg ctg att ttt tct ctg ttg ttt atc 229
Met Ser Asn Gln Arg Leu Pro Leu Ile Phe Ser Leu Leu Phe Ile
-20 -15 -10

tgc ttc ttc ggg gag agt ttc tgc att tgt gat gga act gtc tgg aca 277
Cys Phe Phe Gly Glu Ser Phe Cys Ile Cys Asp Gly Thr Val Trp Thr
-5 1 5

aag gtt gga tgg gag att ctt cca gaa gaa gta cat tat tgg aaa ggt 325
Lys Val Gly Trp Glu Ile Leu Pro Glu Glu Val His Tyr Trp Lys Gly
10 15 20

tgt tta tat ctc att tat aat tta tta caa gct gtc ttc ttc gtc tta 373
Cys Leu Tyr Leu Ile Tyr Asn Leu Leu Gln Ala Val Phe Phe Val Leu
25 30 35 40

ttt gtt ttg tct gtg cat tac ctg tgg aag aaa tgg aag aaa cac caa 421
Phe Val Leu Ser Val His Tyr Leu Trp Lys Lys Trp Lys Lys His Gln
45 50 55

aaa aag ctg aaa aag caa gcc tcc tta gaa aaa cct ggt aat gat cta 469
Lys Lys Leu Lys Lys Gln Ala Ser Leu Glu Lys Pro Gly Asn Asp Leu
60 65 70

gaa agc cca ttg atc aac aac att gac caa aca ctc cac aga gtg gca 517
Glu Ser Pro Leu Ile Asn Asn Ile Asp Gln Thr Leu His Arg Val Ala
75 80 85

acc aca gca tca gtg ata tac aag atc tgg gag cac agg tct cac cat 565
Thr Thr Ala Ser Val Ile Tyr Lys Ile Trp Glu His Arg Ser His His
90 95 100

cct tcc tct aag aaa att aag cac tgc aaa tta aag aag aag agt aaa 613
Pro Ser Ser Lys Lys Ile Lys His Cys Lys Leu Lys Lys Lys Ser Lys
105 110 115 120

gaa gaa gga gcc aga aga tac taaataaatg catatgcaaa tgtagcttag 664
Glu Glu Gly Ala Arg Arg Tyr
125

tcaattatag atatcacaaa agaaatctat catctaagga ttaaaaattg ttctttggaa 724
aaaaaaaaaa aaa 737

<210> 93 
<211> 728 
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 53..646

<220>
<221> sig_peptide
<222> 53..91
<223> Von Heijne matrix
score 4.95353272042967
seq MLLGRLTSQLLRA/VP

<400> 93 

aatttgagcc gcgtcgagct cccctgggac ctgtggccgc cgcccacaga cc atg ctc 58
Met Leu

ctg ggg cgc ctg act tcc cag ctg ttg agg gcc gtt cct tgg gca ggc 106
Leu Gly Arg Leu Thr Ser Gln Leu Leu Arg Ala Val Pro Trp Ala Gly
-10 -5 1 5

ggc cgc ccg cct tgg ccc gtc tct gga gtg ctg ggc agc cgg gtc tgc 154
Gly Arg Pro Pro Trp Pro Val Ser Gly Val Leu Gly Ser Arg Val Cys
10 15 20

ggg ccc ctt tac agc aca tcg ccg gcc ggc cca ggt agg gcg gcc tct 202
Gly Pro Leu Tyr Ser Thr Ser Pro Ala Gly Pro Gly Arg Ala Ala Ser
25 30 35

ctc cct cgc aag ggg gcc cag ctg gag ctg gag gag atg gtc ccc agg 250
Leu Pro Arg Lys Gly Ala Gln Leu Glu Leu Glu Glu Met Val Pro Arg
40 45 50

aag atg tcc gtc agc ccc ctg gag agc tgg ctc acg gcc cgc tgc ttc 298
Lys Met Ser Val Ser Pro Leu Glu Ser Trp Leu Thr Ala Arg Cys Phe
55 60 65

ctg ccc aga ctg gat acc ggg acc gca ggg act gtg gct cca ccg caa 346
Leu Pro Arg Leu Asp Thr Gly Thr Ala Gly Thr Val Ala Pro Pro Gln
70 75 80 85

tcc tac cag tgt ccg ccc agc cag ata ggg gaa ggg gcc gag cag ggg 394
Ser Tyr Gln Cys Pro Pro Ser Gln Ile Gly Glu Gly Ala Glu Gln Gly
90 95 100

gat gaa ggc gtc gcg gat gcg cct caa att cag tgc aaa aac gtg ctg 442
Asp Glu Gly Val Ala Asp Ala Pro Gln Ile Gln Cys Lys Asn Val Leu
105 110 115

aag atc cgc cgg cgg aag atg aac cac cac aag tac cgg aag ctg gtg 490
Lys Ile Arg Arg Arg Lys Met Asn His His Lys Tyr Arg Lys Leu Val
120 125 130

aag aag acg cgg ttc ctg cgg agg aag gtc cag gag gga cgc ctg aga 538
Lys Lys Thr Arg Phe Leu Arg Arg Lys Val Gln Glu Gly Arg Leu Arg
135 140 145

cgc aag cag atc aag ttc gag aaa gac ctg agg cgc atc tgg ctg aag 586
Arg Lys Gln Ile Lys Phe Glu Lys Asp Leu Arg Arg Ile Trp Leu Lys
150 155 160 165

gcg ggg cta aag gaa gcc ccc gaa ggc tgg cag acc ccc aag atc tac 634
Ala Gly Leu Lys Glu Ala Pro Glu Gly Trp Gln Thr Pro Lys Ile Tyr
170 175 180

ctg cgg ggc aaa tgagtctggc gccgcccttc ccgcccgttg ctgctgtgat 686
Leu Arg Gly Lys
185

ccgtagtaat aaattctcag aggacccaaa aaaaaaaaaa aa 728

40 <210> 94 
<211> 582 
<212> DNA 

<213> Homo sapiens 

<220>
<221> CDS
<222> 247..510

<220>
<221> sig_peptide
<222> 247..318
<223> Von Heijne matrix
score 5.20026065148038
seq FCALEVVLPSCDC/RS

<400> 94 

atcatactca ccatggctca caaactgcct gtttgaaact cccttcagtt ctgagaggat 60
gggaacattc tttaagcggt tcgtcttggc acgagacata aggcagttca acatcaagcc 120
cttgccctga acagttccaa atgccaagaa ctggcgaatt actactttgg tttcaatggg 180
tgttccaaaa ggatcatcaa gcttcaggag ctttctgacc ttgaagaaag ggaaaatgaa 240
gatagc atg gtg cca ctt ccg aag caa agc ctg aag ttc ttc tgt gct 288
Met Val Pro Leu Pro Lys Gln Ser Leu Lys Phe Phe Cys Ala
-20 -15

tta gaa gtg gtg ttg cca tcc tgt gat tgc agg agt cct ggc att ggc 336
Leu Glu Val Val Leu Pro Ser Cys Asp Cys Arg Ser Pro Gly Ile Gly
-10 -5 1 5

ttg gtg gag gag cct atg gat aag gtg gag gaa gga cca tta tca ttc 384
Leu Val Glu Glu Pro Met Asp Lys Val Glu Glu Gly Pro Leu Ser Phe
10 15 20

ctt atg aaa agg aag aca gcc cag aag ctt gct att cag aag gct ttg 432
Leu Met Lys Arg Lys Thr Ala Gln Lys Leu Ala Ile Gln Lys Ala Leu
25 30 35

tca gat gca ttc cag aaa ctg ttg att gtt gtt cta ggt aag act gtc 480
Ser Asp Ala Phe Gln Lys Leu Leu Ile Val Val Leu Gly Lys Thr Val
40 45 50

ttg atc atc ctt gaa gta ctt cag ttt cag taagcaaata aactcatttt 530
Leu Ile Ile Leu Glu Val Leu Gln Phe Gln
55 60

gaaaagttaa ttgaataaaa atattgatat ctaaagcaaa aaaaaaaaaa aa 582

<210> 95 

<211> 1913 

<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 143..592

<220>
<221> sig_peptide
<222> 143..277
<223> Von Heijne matrix
score 5.94057630118762
seq VLVDLAILGQAYA/FA

<400> 95 

atttttttgt gcctaagatg cccagtgcgt tgctgggttt ttctgctgtc ctcgggctct 60
ggacatgagg ccagaccttg tgaccttgtt ggcagtgggc agtggcttga tgtgaggtcc 120
cagagacggc aggttcatca ag atg gtg ctc atg tgg acc agt ggt gac gcc 172
Met Val Leu Met Trp Thr Ser Gly Asp Ala
-45 -40

ttc aag acg gcc tac ttc ctg ctg aag ggt gcc cct ctg cag ttc tcc 220
Phe Lys Thr Ala Tyr Phe Leu Leu Lys Gly Ala Pro Leu Gln Phe Ser
-35 -30 -25 -20

gtg tgc ggc ctg ctg cag gtg ctg gtg gac ctg gcc atc ctg ggg cag 268
Val Cys Gly Leu Leu Gln Val Leu Val Asp Leu Ala Ile Leu Gly Gln
-15 -10 -5

gcc tac gcc ttc gcc cca ccc cca gaa gcc ggc gcc cca cgc cgt gca 316
Ala Tyr Ala Phe Ala Pro Pro Pro Glu Ala Gly Ala Pro Arg Arg Ala
1 5 10

ccc cac tgg cac caa ggc cct ctg aca gtg ggg agg acg agg atg tgg 364
Pro His Trp His Gln Gly Pro Leu Thr Val Gly Arg Thr Arg Met Trp
15 20 25

gac cgc cag ccg cgg gca ctg gtg ggc cct gac ctc ccc gcg ggg agg 412
Asp Arg Gln Pro Arg Ala Leu Val Gly Pro Asp Leu Pro Ala Gly Arg
30 35 40 45

gtg ggt gcc gtg gcc cct gca ggt gtg gca gag atg ggg cac ggg cat 460
Val Gly Ala Val Ala Pro Ala Gly Val Ala Glu Met Gly His Gly His
50 55 60

tgg ggt ctc cat cag cct ctg tgg ggt gtc tca ggg tgg gca gtg ggg 508
Trp Gly Leu His Gln Pro Leu Trp Gly Val Ser Gly Trp Ala Val Gly
65 70 75

gtg ggg ctg gga cgc tgt ttg tgc tca gcg ggg aca gcc agg gtt gat 556
Val Gly Leu Gly Arg Cys Leu Cys Ser Ala Gly Thr Ala Arg Val Asp
80 85 90

ctg gcc ccg agg gtt ttg gat gtt ttt agg atg aca taaaaagcaa 602
Leu Ala Pro Arg Val Leu Asp Val Phe Arg Met Thr
95 100 105

gtgttttccc catttcctct tatgaaacac cgtctgagcc caaggtacac attgggcggc 662 

ctgcaggaac ctgctccagg tggacacacg ggccagcagc cgcgaacctt gaagctgggg 722

tgaccgcagg agaccctgta aggcctgtga gcggagccct cgaccccgtg acaccctggc 782 

cagacaccct gcttggactg gggtggcctc tgctacccag gggtctggca cgggggaggg 842 

ctggggcttt ctctgcctgg tacacacgga aaggcggctg tgcggacgca gggtcaccgt 902 

gctccgggtt ttctgacagt cggtgtttcc tgggcctttg gagtggctgc gaggcctgaa 962 

cgccttgtgg atccgctgtg tccagcccgg ctgagcatcg ccagggctag ctcatgctgc 1022

tcttgtcagc ctctggttct cctcgagtcc ttggggacgt ggcagatgcc agcgaccatc 1082 

agacaacgtg gaggccctca tgggcaatgg ctgagggggc cgggctgagg ctgtgcacat 1142 

gcagtctgca cgccactctt gggctctgct ggcggagatc cccttccttc tgggtgcaga 1202 

ctgcacctcc ggatgcagtt ttgatgtcca tcttccagga gagagacggt ctcgggtcca 1262 

gggagtggag ggggctgccc ctgccgtgca ggtcctggcc gatggcgcct taccctgctg 1322

ccctgggctt ttggcctgaa gcaaattcct gagtgggggg tactggggcc tgccgcatcc 1382 

tgtcctgtcc actgcccacc cccgtgtgct ggctccctca cttctggctg cagtgggagc 1442 

cgccagtctg acccttgtca ccgcacgctc tgcccccacc ccgttgcaag aggtcacacc 1502 

atgtcagcag ccttgcactg accgcagccg gcccccaggc ctcagagttc tggatgcttc 1562 

cgtgcggctc caacaggcat cgtcttccct tccgcaggtg gaggggccgc ttcccgcagg 1622

catctgagct ctgtgccggg gccgtggcca tgggaagatg ttccacgctg cctcctcctc 1682 

gagttttcct cggaaacact cttgaatgtc tgagtgaggg tcctgcttag ctctttggcc 1742 

tgtgagatgc tttgaaaatt tttatttttt taagatgaag caagatgtct gtagcggtaa 1802 

ttgcctcaca ttaaactgtc gccgactgca ggcgcagtga ctgctgaatg taccctgtgt 1862 

ggcgacttgg aatcaataaa ccatttgtgg atcctaaaaa aaaaaaaaaa a 1913

<210> 96 
<211> 670 
<212> DNA 
<213> Homo sapiens

<220>
<221> CDS
<222> 33..458

<220>
<221> sig_peptide
<222> 33..89
<223> Von Heijne matrix
score 6.45239823575329
seq SVFLLMVNGQVES/AQ

<400> 96 

aggtgggtcc ccccggcacc cccagacctg cc atg gcg acc gcg agt cct agc 53
Met Ala Thr Ala Ser Pro Ser
-15

gtc ttt cta ctc atg gtc aac ggg cag gtg gag agc gcc cag ttt cca 101
Val Phe Leu Leu Met Val Asn Gly Gln Val Glu Ser Ala Gln Phe Pro
-10 -5 1

gag tat gat gac ttc tac tgc aag tac tgc ttt gtg tac ggc cag gac 149
Glu Tyr Asp Asp Phe Tyr Cys Lys Tyr Cys Phe Val Tyr Gly Gln Asp
5 10 15 20

tgg gcc ccc aca gcg ggt ctg gag gag ggg atc tca cag atc aca tcc 197
Trp Ala Pro Thr Ala Gly Leu Glu Glu Gly Ile Ser Gln Ile Thr Ser
25 30 35

aag agc caa gat gtg cgg caa gca ctg gtg tgg aac ttc ccc att gat 245
Lys Ser Gln Asp Val Arg Gln Ala Leu Val Trp Asn Phe Pro Ile Asp
40 45 50

gtc acc ttt aaa agc acc aac ccc tac ggc tgg cca cag atc gtg ctc 293
Val Thr Phe Lys Ser Thr Asn Pro Tyr Gly Trp Pro Gln Ile Val Leu
55 60 65

agc gtg tat gga cca gat gtg ttc ggg aac gat gtg gtt cga ggc tat 341
Ser Val Tyr Gly Pro Asp Val Phe Gly Asn Asp Val Val Arg Gly Tyr
70 75 80

ggg gcc gtg cac gtg ccc ttc tca cct ggc cgg cac aaa agg acc atc 389
Gly Ala Val His Val Pro Phe Ser Pro Gly Arg His Lys Arg Thr Ile
85 90 95 100

ccc atg ttt gtc cca gaa tct acg tct aaa ctg cag aag ttt aca aga 437
Pro Met Phe Val Pro Glu Ser Thr Ser Lys Leu Gln Lys Phe Thr Arg
105 110 115

tct gca agc tgc tcc acc cac tgaggacaaa tagaaacagg tcccctggga 488
Ser Ala Ser Cys Ser Thr His
120

gtgctgagtc acggggctcc cttcagccct gttccagcag cagaaggccg ggcgatttta 548
ccctgtgccc tgtgaaaaat ctttgtgtct gagggggcag aggaaaaact cttgtcagat 608
gggaaaaatg ctcatgacat aatgtgacat taaaaggtgg gaaacaaaaa aaaaaaaaaa 668
aa 670

<210> 97
<211> 939
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 1..336

<220>
<221> sig_peptide
<222> 1..81
<223> Von Heijne matrix
score 3.68137078794859
seq AHLCSDSLPESQQ/QD

<400> 97

act tcc gaa gag aga acc gcc ... ... ... ... ggg ggt gcc gcc cac 48
Thr Ser Glu Glu Arg Thr Ala ... ... ... ... Gly Gly Ala Ala His
-25 -20 -15

ctc tgc tcc gac agc ctc ccg ... ... cag cag caa gac ggc aac cac 96
Leu Cys Ser Asp Ser Leu Pro ... ... Gln Gln Gln Asp Gly Asn His
-10 -5 1 5

gca ccc aac ttc tcc agc cac ... ... ... ... cgt cgc cag cgg scc 144
Ala Pro Asn Phe Ser Ser His ... ... ... ... Arg Arg Gln Arg Xaa
10 15 20

gac atg aca agg cgc tgc atg ... ... ... ... ggt ttc ccc tca tcc 192
Asp Met Thr Arg Arg Cys Met ... ... ... ... Gly Phe Pro Ser Ser
25 30 35

cca gcc ccg ggg tcg tcg ccc ccg cgc tgc cat ctg aga ccc ggt agt 240
Pro Ala Pro Gly Ser Ser Pro Pro Arg Cys His Leu Arg Pro Gly Ser
40 45 50

acc gcc cat gct gca gcg gga aag aga aca gag agt cct ggg gac agg 288
Thr Ala His Ala Ala Ala Gly Lys Arg Thr Glu Ser Pro Gly Asp Arg
55 60 65

tac cgt gca gag ggc ttg aga agg ggc cgg gtc gcg ggg gca agg gta 336
Tyr Arg Ala Glu Gly Leu Arg Arg Gly Arg Val Ala Gly Ala Arg Val
70 75 80 85

tgaggggagg gctgcagacc gccgctcttc cagttcccgc catcctccgc gagctcaggc 396
gttggcattt cggggcctgg caaatccccg ccccgcctcc gcgcaggggc tactgggagt 456
tggagtttgc ttctctgtag ttgggcagct gctcttggtc tagtgaccac cagcctggac 516
agctacggag aacccgcctt aggtagaaag aaagtgattt ttttcctttg caagagtttg 576
acccgggacc ctaactgctt aatgcatatt tagatcgttt tctgtacgtt gtcagttcta 636
ctgatcctag tggtttagta atataaacct tttctatgtt gtgggtgaaa ttatgtaacc 696
tgtgatgagg gaatcccttc cacgaattac tttgtagtcc agcgtgcacg ctagttcata 756
cttaaaagaa cttgcagatt tggaatgtga cgtgttttct ctttcagtaa cttcacgcct 816
ctccaagagg ctaatttttt tgtaaagatt ttgtgggagc tatgtaatga gatggggagt 876
ttcatctaat gacatcctct gacaataaaa aatgtttaaa ttccccaaaa aaaaaaaaaa 936
aaa 939

<210> 98 
<211> 661 
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 174..443

<220>
<221> sig_peptide
<222> 174..269
<223> Von Heijne matrix
score 4.13107367257584
seq SSLAFCQVGFLTA/QP

<400> 98 

aaaaaggaac tttcagtgat aatgaacaaa actcaggagc tatgtggatg acaggagcac 60
ctagatgacc gactttaccc acttcaaatg ctaccttgac cctagcactc tctccaccct 120
gcatcctcac ctcagaccat cagttggtta ggccaacagc tcaccatcaa ttc atg 176
Met

ccc tgc cta gac caa cag ctc act gtt cat gcc cta ccc tgc cct gcc 224
Pro Cys Leu Asp Gln Gln Leu Thr Val His Ala Leu Pro Cys Pro Ala
-30 -25 -20

cag ccc tcc tct ctg gcc ttc tgc caa gtg ggg ttc tta aca gca cag 272
Gln Pro Ser Ser Leu Ala Phe Cys Gln Val Gly Phe Leu Thr Ala Gln
-15 -10 -5 1

cct tca cct ccg aga agg cgc aat ggg aaa gac aga tac acg ttg gtt 320
Pro Ser Pro Pro Arg Arg Arg Asn Gly Lys Asp Arg Tyr Thr Leu Val
5 10 15

ctg caa cac cag gaa tgc cag gat gat tta gcc acc tcc tca ctt gtc 368
Leu Gln His Gln Glu Cys Gln Asp Asp Leu Ala Thr Ser Ser Leu Val
20 25 30

tac ctt tcc ctc ccc tgc ttc aaa gac ttg ggt cga tcg aag cac caa 416
Tyr Leu Ser Leu Pro Cys Phe Lys Asp Leu Gly Arg Ser Lys His Gln
35 40 45

agc atc act gtt gct gac act aac aag tagtgccaag ggattgcctt 463
Ser Ile Thr Val Ala Asp Thr Asn Lys
50 55

taaggaagat caggagcgga acatctggtg gcaaagaaaa tctttctaat agccccattc 523
tagtgaccac cttcaacctc ctcatagcag gagagtttgg gagtagggga cttaggatgt 583
tttgttcttt taatcaattc agaaaatatg tatgtttgaa ataaaaataa aaatacttga 643
gccaaaaaaa aaaaaaaa 661

<210> 99 
<211> 647 
<212> DNA 
<213> Homo sapiens

<220>
<221> CDS
<222> 282..521

<220>
<221> sig_peptide
<222> 282..386
<223> Von Heijne matrix
score 3.64439944832387
seq LEPGLSSSAACNG/KE

<400> 99 

acttgcgtgt caccgttacc gtagcgactg ggcttctgga ctgtatatcc tagctgcctt 60
gtcaacatct tcgagcatcg gcagctccgg aggccggggt aactggcagg taggaaacta 120
tgtgaaagaa tctcctgatg tcataatttc cgggtgtcac cggaacattt gatcatcatt 180
cctttggcaa ttccagcctt ctgtggaaag gccagtagaa agcattgatt tattcacctc 240
tacaggaatc agactcagcc tcttttggtt ttcagtgaag t atg cct ttt caa ttt 296
Met Pro Phe Gln Phe
-35

gga acc cag cca agg agg ttt cca gtg gaa gga gga gat tct tca att 344
Gly Thr Gln Pro Arg Arg Phe Pro Val Glu Gly Gly Asp Ser Ser Ile
-30 -25 -20 -15

gag ctg gaa cct ggg ctg agc tcc agt gct gcc tgt aat ggg aag gag 392
Glu Leu Glu Pro Gly Leu Ser Ser Ser Ala Ala Cys Asn Gly Lys Glu
-10 -5 1

atg tca cca acc agg caa ctc cgg agg tgc cct gga agt cat tgc ctg 440
Met Ser Pro Thr Arg Gln Leu Arg Arg Cys Pro Gly Ser His Cys Leu
5 10 15

aca ata act gat gtt ccc gtc act gtt tat gca aca acg aga aag cca 488
Thr Ile Thr Asp Val Pro Val Thr Val Tyr Ala Thr Thr Arg Lys Pro
20 25 30

cct gca caa agc agc aag gaa atg cat cct aaa tagcaccatt aagtcttttg 541
Pro Ala Gln Ser Ser Lys Glu Met His Pro Lys
35 40 45

tcaaggtctg actaggtcaa gggtaatgga ccagtatcat ctggtgatct ggtaaacaaa 601
taaaagtggt ggcaccttta gatgatgaca aaaaaaaaaa aaaaaa 647

<210> 100
<211> 1006
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 251..643

<220>
<221> sig_peptide
<222> 251..295
<223> Von Heijne matrix
score 3.74215118492367
seq LLMFTQLLLCGFL/YV

<400> 100 

aggaagccag agggctggaa atacagcagc ctttgaagta ccctctgtta atttggatgg 60
atctcagtgt gccccgttcg agacctctcc accaacacct tctgatcttg cgatttgctc 120
ttcttgactt taattagtat ctaggaaagt ctaaactttg gacctacctc tttttttgat 180
actcattttt gtacttttgc tctctgggat tggtttctta aagaatctgg atccttttta 240
atatgtcaaa atg agt ctg ctg atg ttt aca caa cta ctg ctc tgt gga 289
Met Ser Leu Leu Met Phe Thr Gln Leu Leu Leu Cys Gly
-15 -10 -5

ttt tta tat gtt cgg gtt gat gga tcg cgt ctt cgc cag gag gac ttt 337
Phe Leu Tyr Val Arg Val Asp Gly Ser Arg Leu Arg Gln Glu Asp Phe
1 5 10

ccc ccg cgg att gtg gag cat cct tcc gat gtc atc gtc tct aag ggc 385
Pro Pro Arg Ile Val Glu His Pro Ser Asp Val Ile Val Ser Lys Gly
15 20 25 30

gag ccc acg act ctg aac tgc aag gcg gag ggc cgg cca acg ccc acc 433
Glu Pro Thr Thr Leu Asn Cys Lys Ala Glu Gly Arg Pro Thr Pro Thr
35 40 45

att gag tgg tac aaa gat ggg gag cga gtg gag act gac aag gac gat 481
Ile Glu Trp Tyr Lys Asp Gly Glu Arg Val Glu Thr Asp Lys Asp Asp
50 55 60

ccc cgg tcc cac agg atg ctt ctg ccc agc gga tcc tta ttc ttc ttg 529
Pro Arg Ser His Arg Met Leu Leu Pro Ser Gly Ser Leu Phe Phe Leu
65 70 75

cgc atc gtg cac ggg cgc agg agt aaa cct gat gaa gga agc tac gtt 577
Arg Ile Val His Gly Arg Arg Ser Lys Pro Asp Glu Gly Ser Tyr Val
80 85 90

tgt gtt gcg agg aac tat ctt ggt gaa gca gtg agt cga aat gcg tct 625
Cys Val Ala Arg Asn Tyr Leu Gly Glu Ala Val Ser Arg Asn Ala Ser
95 100 105 110

ctg gaa gtg gca tgt aag tgaacataat gaacctcatg tgcacattt 673
Leu Glu Val Ala Cys Lys
115

cttttattta tttcaagtaa gttttgatgt gttcccatag acgctgaaac ctaaagaatc 733
aatcaacaca ctgcataatt ttacttggtc ttcttcagag aagtctggtc aagatagtat 793
caagccaggg tgttgtagta agtttgttta tatgaaatca agatgaccaa tatgttatta 853
taagaaagca ggccgggcgc ggtggctcac gcctgtaatc ccagcacttt gggaggcgga 913
ggcgggcgga tcacgaggtc aggagatcga gaccatcctg ggtagcacgg tggggccccg 973
tctctacaaa aaatacaaaa aaaaaaaaaa aaa 1006



<210> 101 

<211> 1059 

<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 179..475

<220>
<221> sig_peptide
<222> 179..295
<223> Von Heijne matrix
score 4.14109371250204
seq PSLIAGLFVGCLA/GY



<400> 101 

gtttttccag gagggagcgg cctttgctca gcgcgagacg gctgggcgcc gagtgggaca 60
gcgctggtgc ggagactgct tccggactcc aggtaccgcg cttggcggca gctggcccca 120
gacttctgtc ttttcagctg cagtgaaggc tcggggctgc agaattgcaa ccttgcca 178

atg gac ctg atc ggt ttt ggt tat gca gcc ctc gtg aca ttt gga agc 226
Met Asp Leu Ile Gly Phe Gly Tyr Ala Ala Leu Val Thr Phe Gly Ser
-35 -30 -25

att ttt gga tat aag cgg aga ggt ggt gtt ccg tct ttg att gct ggt 274
Ile Phe Gly Tyr Lys Arg Arg Gly Gly Val Pro Ser Leu Ile Ala Gly
-20 -15 -10

ctt ttt gtt gga tgt ttg gcc ggc tat gga gct tac cgt gtc tcc aat 322
Leu Phe Val Gly Cys Leu Ala Gly Tyr Gly Ala Tyr Arg Val Ser Asn
-5 1 5

gac aaa cga gat gta aaa gtg tca ctg ttt aca gct ttc ttc ctg gct 370
Asp Lys Arg Asp Val Lys Val Ser Leu Phe Thr Ala Phe Phe Leu Ala
10 15 20 25

acc ata atg ggt gtg aga ttt aag agg tcc aag aaa ata atg cct gct 418
Thr Ile Met Gly Val Arg Phe Lys Arg Ser Lys Lys Ile Met Pro Ala
30 35 40

ggt ttg gtt gca ggt tta agc ctc atg atg atc ctg aga ctt gtc ttg 466
Gly Leu Val Ala Gly Leu Ser Leu Met Met Ile Leu Arg Leu Val Leu
45 50 55

ttg ctg ctc tgagcatctg gaggaacaga aaactaagtt catgtcatcc 515
Leu Leu Leu
60

tgctgtaatg ggcagagcat attttttttg tatttaaaag ataaacttca atatggaatg 575
ctagaaacac aaatagcact gtcacctcta atatgaacat tagtttgagg tagttttttt 635
ctaaagcaaa aattttaact gttttctaat tgtcaagcac tattttcatt aaaagtgtct 695
aatgaatcat gatatactct tccatttgtt gtgtctattt tttatatatt tggtattttt 755
tgaaaattcc aaatactcat gtctcaagta agcttaaact acaacttgtc acataaagga 815
agtcttaagt ggagttcaca gaatgataat gtatctattt gtcatttgtg ttatatttga 875
aattattaga aattatgctt tttccatttt aattgtattg ctgccagtgc tatttttttc 935
tttaaaaaat tttattctta gcacactgtt atgtcctaac tgaatgtatt cagtattcaa 995
ataaaagaca ttttggtcca aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1055
aaaa 1059

<210> 102 
<211> 514 
<212> DNA 
<213> Homo sapiens

<220>
<221> CDS
<222> 34..327

<220>
<221> sig_peptide
<222> 34..162
<223> Von Heijne matrix
score 5.69273078757386
seq LGDALLFLRPAGS/CA

<400> 102 

agagacctat ggggttcgcc tgaagccccc gga atg tgt gag aca ctt ctt act 54
Met Cys Glu Thr Leu Leu Thr
-40

agt aaa tgg gct tca gta tcc ccc atc cct gca ctc ctg cag gaa ggt 102
Ser Lys Trp Ala Ser Val Ser Pro Ile Pro Ala Leu Leu Gln Glu Gly
-35 -30 -25

gag aat cgg gac agt cgc agg ctg gga gac gct ctg ctt ttc ctg cgt 150
Glu Asn Arg Asp Ser Arg Arg Leu Gly Asp Ala Leu Leu Phe Leu Arg
-20 -15 -10 -5

cct gct ggg agc tgc gcg ctc cag gta tcc tgg cct gcc gcc cta gcc 198
Pro Ala Gly Ser Cys Ala Leu Gln Val Ser Trp Pro Ala Ala Leu Ala
1 5 10

ggc cca agg agc cac aca gga cag ttg acc caa cac ttc tgc cac ctg 246
Gly Pro Arg Ser His Thr Gly Gln Leu Thr Gln His Phe Cys His Leu
15 20 25

aag aac gac acc tgc att cct cca tct ctg gga cca cca agg aac tca 294
Lys Asn Asp Thr Cys Ile Pro Pro Ser Leu Gly Pro Pro Arg Asn Ser
30 35 40

ggg agc ttg gaa tct ctc aga tca aaa aga tac tgactcatcg gatagccatg 347
Gly Ser Leu Glu Ser Leu Arg Ser Lys Arg Tyr
45 50 55

gcatcctgaa aacggccttc cttgtgtgta cattatttgc aacaagcaac aagtttataa 407
gcactttggt aaaattgcat gtgagggtta aaatattaaa gtcagtgcgt caacttgaaa 467
taaatgatga gttattgatt actgctaaag aaaaaaaaaa aaaaaaa 514

<210> 103 
<211> 1158
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 303..953

<220>
<221> sig_peptide
<222> 303..359
<223> Von Heijne matrix
score 5.47911600153114
seq LCCSGCVPSLCCS/SY

<400> 103

aaaaacttcc gccgccgcgt ccgccgcctc cggaactaaa cggggtgagg tcacattcgg 60
ttatctctaa cgttggaaaa cgatggagct aacacccatt atggagatta accacttttc 120
atcaggtttt taacttaagt cgtgaggaat acaacggtga acacaagatt cattttattt 180
tcatcaccat gggacgtatc ctgttgttga gttctctggg tcagacctct gaagacttct 240
cagatggatc ctagtctctg ggcttgccct gaaattactc gctgctcagg gagagagttg 300
aa atg gtt ggc atc ctc cca ctc tgt tgc tcc ggc tgt gtc ccc tcg 347
Met Val Gly Ile Leu Pro Leu Cys Cys Ser Gly Cys Val Pro Ser
-15 -10 -5

ctc tgt tgt tcc agc tat gtc ccc tct gtt gct cca act gca gct cat 395
Leu Cys Cys Ser Ser Tyr Val Pro Ser Val Ala Pro Thr Ala Ala His
1 5 10

tct gtt aga gtt cct cat tca gct ggt cac tgt ggc cag agg gtg ttg 443
Ser Val Arg Val Pro His Ser Ala Gly His Cys Gly Gln Arg Val Leu
15 20 25

gcc tgc tcc ctt cct caa gta ttc tta aag cca tgg att ttt gtg gag 491
Ala Cys Ser Leu Pro Gln Val Phe Leu Lys Pro Trp Ile Phe Val Glu
30 35 40

cat ttt tct tcc tgg ctc tcc ctt gag tta ttt tcc ttt ctt cgc tat 539
His Phe Ser Ser Trp Leu Ser Leu Glu Leu Phe Ser Phe Leu Arg Tyr
45 50 55 60

ctt ggg act ctt ctt tgt gct tgc gga cat cgg ttg aga gaa gga cga 587
Leu Gly Thr Leu Leu Cys Ala Cys Gly His Arg Leu Arg Glu Gly Arg
65 70 75

ctt ctt cct tgt ctc ctt ggt gtt ggc tcg tgg ttg ctc ttc aac aac 635
Leu Leu Pro Cys Leu Leu Gly Val Gly Ser Trp Leu Leu Phe Asn Asn
80 85 90

tgg act gga ggc tct tgg ttt tct ctt cat ctt caa caa gtc agt ctc 683
Trp Thr Gly Gly Ser Trp Phe Ser Leu His Leu Gln Gln Val Ser Leu
95 100 105

tct caa ggg tct cac gtt gca gca ttc tta cca gag gcc att ggg cct 731
Ser Gln Gly Ser His Val Ala Ala Phe Leu Pro Glu Ala Ile Gly Pro
110 115 120

gga gtt cca gtt cca gtg tct gga gag tcc acc tca gct cag caa tct 779
Gly Val Pro Val Pro Val Ser Gly Glu Ser Thr Ser Ala Gln Gln Ser
125 130 135 140

cat gcc ggt tgg caa ttg tca gca gaa gcc gat gcc tgc cca tca gtt 827
His Ala Gly Trp Gln Leu Ser Ala Glu Ala Asp Ala Cys Pro Ser Val
145 150 155

ctt tac tct gag gtg tta gag tgg aat aaa aat ata aat act tat act 875
Leu Tyr Ser Glu Val Leu Glu Trp Asn Lys Asn Ile Asn Thr Tyr Thr
160 165 170

agt ttt cat gac ttc tgc tta ata ttg ggt att ttt ktt gtt ttg ttt 923
Ser Phe His Asp Phe Cys Leu Ile Leu Gly Ile Phe Xaa Val Leu Phe
175 180 185

tgt ttt ggc ggt gat agg ctt acc tta cat taaaccaggc cttagccttt 973
Cys Phe Gly Gly Asp Arg Leu Thr Leu His
190 195

ctgtggcttt gttatggcaa agcctcatat tactctctag tctggttcag caggacagtc 1033
aggtccacac ctggggctgt ttgttttcta cgtttacctc aacataaggt accttatcat 1093
tgtcagcctt catctcctga tccaaaataa aataaaatgc cacaggtcaa aaaaaaaaaa 1153
aaaaa 1158



<210> 104
<211> 1563
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 97..645

<220>
<221> sig_peptide
<222> 97..156
<223> Von Heijne matrix
score 8.42885652997473
seq AVVGCLLVPPAEA/NK

<220>
<221> misc_feature
<222> 972
<223> n=a, g, c or t

<400> 104

aatagaagct aggagagggc ggggacaact gggtcttttg cggctgcagc gggcttgtag 60 
15 gtgtccggct ttgctggccc agcaagcctg ataagc atg aag etc tta tct ttg 114 

Met Lys Leu Leu Ser Leu 
-20 -15 
gtg get gtg gtc ggg tgt ttg ctg gtg ccc cca get gaa gee aac aag 162 
Val Ala Val Val Gly Cys Leu Leu Val Pro Pro Ala Glu Ala Asn Lys 
20 -10 -5 1 

agt tct gaa gat ate egg tgc aaa tgc ate tgt cca cct tat aga aac 210 
Ser Ser Glu Asp lie Arg Cys Lys Cys lie Cys Pro Pro Tyr Arg Asn 

5 10 15 

ate agt ggg cac att tac aac cag aat gta tec cag aag gac tgc aac 258 
25 lie Ser Gly His lie Tyr Asn Gin Asn Val Ser Gin Lys Asp Cys Asn 
20 25 30 

tgc ctg cac gtg gtg gag ccc atg cca gtg cct ggc cat gac gtg gag 306 
Cys Leu His Val Val Glu Pro Met Pro Val Pro Gly His Asp Val Glu 
35 40 45 50 

30 gee tac tgc ctg ctg tgc gag tgc agg tac gag gag cgc age acc ace 354 
Ala Tyr Cys Leu Leu Cys Glu Cys Arg Tyr Glu Glu Arg Ser Thr Thr 

55 60 65 

acc ate aag gtc ate att gtc ate tac ctg tec gtg gtg ggt gec ctg 402 
Thr lie Lys Val lie lie Val lie Tyr Leu Ser Val Val Gly Ala Leu 
35 70 75 80 

ttg etc tac atg gee ttc ctg atg ctg gtg gac cct ctg ate cga aag 450 
Leu Leu Tyr Met Ala Phe Leu Met Leu Val Asp Pro Leu lie Arg Lys 

85 90 95 

ccg gat gca tac act gag caa ctg cac aat gag gag gag aat gag gat 498
Pro Asp Ala Tyr Thr Glu Gln Leu His Asn Glu Glu Glu Asn Glu Asp
    100                 105                 110

get cgc tct atg gca gca get get gca tec etc ggg gga ccc cga gca 546 
Ala Arg Ser Met Ala Ala Ala Ala Ala Ser Leu Gly Gly Pro Arg Ala 
115 120 125 130 

45 aac aca gtc ctg gag cgt gtg gaa ggt gec cag cag egg tgg aag ctg 594 
Asn Thr Val Leu Glu Arg Val Glu Gly Ala Gin Gin Arg Trp Lys Leu 

135 140 145 

cag gtg cag gag cag egg aag aca gtc ttc gat egg cac aag atg etc 642 
Gin Val Gin Glu Gin Arg Lys Thr Val Phe Asp Arg His Lys Met Leu 
50 150 155 160 

age tagatgggct ggtgtggttg ggtcaaggee ccaacaccat ggctgccagc 695 
Ser 

ttccaggctg gacaaagcag ggggctactt ctcccttccc tcggttccag tcttcccttt 755 
aaaagcctgt ggcatttttc ctccttctcc ctaactttag aaatgttgta cttggctatt 815 

55 ttgattaggg aagagggatg tggtctctga tctctgttgt cttcttgggt ctttggggtt 875 
gaagggaggg ggaaggcagg cccasaaggg aatggagaca ttcgaggegg cctcaggagt 935 
ggatgegate ttgtctctcc tkggcctccc actcttngcc gccttccagc tctgagtctt 995 
gggaatgttg ttacccttgg aagataaagy ctgggtcttc aggaactcag tgtctgggag 1055 
gaaagcatgg cccagcattc agcatgtgtt cctttctgca gtggttctta tcaccacctc 1115 

60 cctcccagcc ccagcgcctc agccccagcc ccagctccag ccctgaggac agctctgatg 1175 
ggagagctgg gccccctgag cccactgggt cttcagggtg cactggaagc tggtgttcgc 1235 
tgtcccctgt gcacttctcg cactggggca tggagtgccc atgeataetc tgctgccggt 1295 
cccctcacct gcacttgagg ggtctgggca gtccctcctc tccccagtgt ccacagtcac 1355 




tgagccagac ggtcggttgg aacatgagac tcgaggctga gcgtggatct gaacaccaca 1415 

gcccctgtac ttgggttgcc tcttgtccct gaacttcgtt gtaccagtgc atggagagaa 1475 

aattttgtcc tcttgtctta gagttgtgtg taaatcaagg aagccatcat taaattgttt 1535 

tatttctctc taaaaaaaaa aaaaaaaa 1563 

<210> 105
<211> 1621
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 80..820

<220>
<221> sig_peptide
<222> 80..118
<223> Von Heijne matrix
score 5.76690322882439
seq MLVLRSALTRALA/SR

<400> 105

acctttccac tcgggaaacc ttcagaggag tctcagaaag gacacggctg gctgcttttc 60 

tcagcgccga agccgcgcc atg etc gtc etc aga age gee ctg act egg gcg 112 

25 Met Leu Val Leu Arg Ser Ala Leu Thr Arg Ala 

-10 -5 

ctg gcc tca cgg acg ctg gcg cct cag atg tgc tca tct ttt gct acg 160
Leu Ala Ser Arg Thr Leu Ala Pro Gln Met Cys Ser Ser Phe Ala Thr
        1               5                   10

30 gga ccc aga caa tac gat gga ata ttc tat gaa ttt cgt tct tat tac 208 

Gly Pro Arg Gin Tyr Asp Gly lie Phe Tyr Glu Phe Arg Ser Tyr Tyr 

15 20 25 30 

ctt aag ccc tea aag atg aat gag ttc ctg gaa aat ttt gag aaa aac 256 

Leu Lys Pro Ser Lys Met Asn Glu Phe Leu Glu Asn Phe Glu Lys Asn 

35 35 40 45 

get cat ctt egg aca get cac tct gaa ttg gtt gga tac tgg agt gta 304 

Ala His Leu Arg Thr Ala His Ser Glu Leu Val Gly Tyr Trp Ser Val 

50 55 60 

gaa ttt gga ggc aga atg aat aca gtg ttt cat att tgg aag tat gat 352 

40 Glu Phe Gly Gly Arg Met Asn Thr Val Phe His lie Trp Lys Tyr Asp 

65 70 75 

aat ttt get cat cga act gaa gtt cag aaa gee ttg gee aaa gat aag 400 

Asn Phe Ala His Arg Thr Glu Val Gin Lys Ala Leu Ala Lys Asp Lys 

80 85 90 

45 gaa tgg caa gaa caa ttc etc att cca aat ttg get etc att gat aaa 448 

Glu Trp Gin Glu Gin Phe Leu lie Pro Asn Leu Ala Leu lie Asp Lys 

95 100 105 110 

caa gag agt gag att act tat ctg gta cca tgg tgc aaa tta gaa aaa 496 

Gin Glu Ser Glu lie Thr Tyr Leu Val Pro Trp Cys Lys Leu Glu Lys 

50 115 120 125 

cct cca aaa gaa gga gtc tat gaa ctg gee act ttt cag atg aaa cct 544 

Pro Pro Lys Glu Gly Val Tyr Glu Leu Ala Thr Phe Gin Met Lys Pro 

130 135 140 

ggt ggg cca get ctg tgg ggt gat gca ttt aaa agg gca gtt cat get 592 

55 Gly Gly Pro Ala Leu Trp Gly Asp Ala Phe Lys Arg Ala Val His Ala 

145 150 155 

cat gtc aat eta ggc tac aca aaa eta gtt gga gtg ttc cac aca gag 640 

His Val Asn Leu Gly Tyr Thr Lys Leu Val Gly Val Phe His Thr Glu 

160 165 170 

60 tac gga gca etc aac aga gtt cat gtt ctt tgg tgg aat gag agt gca 688 

Tyr Gly Ala Leu Asn Arg Val His Val Leu Trp Trp Asn Glu Ser Ala 

175 180 185 190 

gat agt cgt gca gct ggg aga cat aag tcc cat gag gat ccc aga gtt 736
Asp Ser Arg Ala Ala Gly Arg His Lys Ser His Glu Asp Pro Arg Val
                195                 200                 205

gtg gca gct gtt cgg gaa agt gtc aac tac cta gta tct cag cag aat 784
Val Ala Ala Val Arg Glu Ser Val Asn Tyr Leu Val Ser Gln Gln Asn
            210                 215                 220

atg ctt ctg att cct aca tcg ttt tca cca ctg aaa tagttttcta 830
Met Leu Leu Ile Pro Thr Ser Phe Ser Pro Leu Lys
        225                 230

ctgaaataca aaacatttca ttaactgcta taggatctct ctgctaatgg tgcttaaatt 890
ctcccaagag gttctcactt ttatttgaag gaggtggtaa gttaatttgc tatgtttctt 950
gcattatgaa ggctacatct gtgctttgta agtaccactt caaaaaatag ttctgtttac 1010
tttctgcatg gtatttcagt gtctgtcata cattaaaaat acttgtcact gttttaagat 1070
cttgactctt catttgtttc agaatagctc ttctactgta ttctgacaac tctttgcttt 1130
atagcatttt gttgtattca aatgataatg gtagcatttc catgcttgtg acagcatttt 1190
taagttatta atatatttta tcaacctttc catcatgtct gttttcctgg ttttttttgg 1250
ttgttttttg accagtaaaa tttattttgt aataccaaat aggatttaag aaaattaacg 1310
tatttcttta ctatggaaaa ccacattgtc atttgtgaca tcatctatat taaatatggt 1370
tttcacatta gttatttgtc acttacttgg aaaatgatgc tgttaggtcc tggtattaaa 1430
aatctagaaa agacttgttg gtttatgtgc tgaaatgtct ttatttataa ttaattttaa 1490
ctactattta ctttatttcg gatcctgttt aacaaagata cttgagacat ccatttgttt 1550
taatgaaatc tgtatggata tggaaatgct tgccctaata aaagcctaca tatacaaaaa 1610
aaaaaaaaaa a 1621



<210> 106
<211> 557
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 77..388

<220>
<221> sig_peptide
<222> 77..217
<223> Von Heijne matrix
score 4.57105404339594
seq FLYLTLNQSCIFA/NY

<400> 106

aacaccctcc ctggaccctc tgcctggagg acggggaatc acagcagctg gtttggggtg 60
cctcccaaac caaaag atg ttc tct ccg cgc caa gct ttg acg ccc gac ccc 112
                  Met Phe Ser Pro Arg Gln Ala Leu Thr Pro Asp Pro
                          -45                 -40

ctg cac tct ccc gcc tac tca ccg gtc cta ggg ggt tgg tcc cgc ttt 160
Leu His Ser Pro Ala Tyr Ser Pro Val Leu Gly Gly Trp Ser Arg Phe
-35                 -30                 -25                 -20

cgt agt gtg gat ttt cgt ttc ctc tac ttg act cta aat caa tcc tgt 208
Arg Ser Val Asp Phe Arg Phe Leu Tyr Leu Thr Leu Asn Gln Ser Cys
                -15                 -10                 -5

ata ttc gca aac tac aaa gag gcg cat gca aat aga tac tgt act gag 256
Ile Phe Ala Asn Tyr Lys Glu Ala His Ala Asn Arg Tyr Cys Thr Glu
            1               5                   10

ggc aga tac acg cgc gag atc cag agg ctt aca tcc cca gcc gct tgg 304
Gly Arg Tyr Thr Arg Glu Ile Gln Arg Leu Thr Ser Pro Ala Ala Trp
    15                  20                  25

ccc acc aga gac aag aac agg atg ata agc aat gga atg gca ttg aac 352
Pro Thr Arg Asp Lys Asn Arg Met Ile Ser Asn Gly Met Ala Leu Asn
30                  35                  40                  45

tct cct gct gaa gga ctt gca ttt caa tgt aga ttc tgaggctggg 398
Ser Pro Ala Glu Gly Leu Ala Phe Gln Cys Arg Phe
                50                  55

tgaaaacttc tctgtcacct ttactacagc attctcaccc atttatattt ctttcccctt 458
ctacatctct attactgttg cactatgtta tgcattacac catggcaaaa ttaatcaatt 518
aatacaataa aagcttaatt ttaaaaaaaa aaaaaaaaa 557

<210> 107
<211> 600
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 139..513

<220>
<221> sig_peptide
<222> 139..201
<223> Von Heijne matrix
score 5.86857787719223
seq IVMGVQVVGRAFA/RA

<400> 107

gaggcaatgc gcatgcccag cgccgtatcg cgcacgctct ctgcggcttt ccttgacctc 60
tgacccgccg accacgcttg atccccggcc gcggggccag gaagtcggag tttgagcccc 120
ggaggcagag cggctgcc atg gcc aag tac ctg gcc cag atc att gtg atg 171
                    Met Ala Lys Tyr Leu Ala Gln Ile Ile Val Met
                        -20                 -15

ggc gtg cag gtg gtg ggc agg gcc ttt gca cgg gcc ttg cgg cag gag 219
Gly Val Gln Val Val Gly Arg Ala Phe Ala Arg Ala Leu Arg Gln Glu
-10                 -5                  1               5

ttt gca gcc agc cgg gcc gca gct gat gcc cga gga cgc gct gga cac 267
Phe Ala Ala Ser Arg Ala Ala Ala Asp Ala Arg Gly Arg Ala Gly His
            10                  15                  20

cgg tct gca gcc gct tcc aac ctc tcc ggc ctc agc ctc cag gag gca 315
Arg Ser Ala Ala Ala Ser Asn Leu Ser Gly Leu Ser Leu Gln Glu Ala
        25                  30                  35

cag cag att ctc aac gtg tcc aag ctg agc cct gag gag gtc cag aag 363
Gln Gln Ile Leu Asn Val Ser Lys Leu Ser Pro Glu Glu Val Gln Lys
    40                  45                  50

aac tat gaa cac tta ttt aag gtg aat gat aaa tcc gtg ggt ggc tcc 411
Asn Tyr Glu His Leu Phe Lys Val Asn Asp Lys Ser Val Gly Gly Ser
55                  60                  65                  70

ttc tac ctg cag tca aag gtg gtc cgc gca aag gag cgc ctg gat gag 459
Phe Tyr Leu Gln Ser Lys Val Val Arg Ala Lys Glu Arg Leu Asp Glu
                75                  80                  85

gaa ctc aaa atc cag gcc cag gag gac aga gaa aaa ggg cag atg ccc 507
Glu Leu Lys Ile Gln Ala Gln Glu Asp Arg Glu Lys Gly Gln Met Pro
            90                  95                  100

cat acg tgactgctcg gctccccccg cccaccccgc cgcctctaat ttatagcttg 563
His Thr

gtaataaatt tcttttctac aaaaaaaaaa aaaaaaa 600



<210> 108
<211> 1129
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 81..986

<220>
<221> sig_peptide
<222> 81..134
<223> Von Heijne matrix
score 5.03543461931947
seq ITLLGLAVNVVTT/LV

<400> 108

5 acagcgcggc gggcgtctcg ctgctcgagc cgccgctgca gctctactgg acctggctgc 60 
tccagtggat cccgctctgg atg gcc ccc aac tec ate acc ctg ctg ggg etc 113 

Met Ala Pro Asn Ser lie Thr Leu Leu Gly Leu 
-15 -10 

gcc gtc aac gtg gtc acc acg ctc gtg ctc atc tcc tac tgt ccc acg 161
Ala Val Asn Val Val Thr Thr Leu Val Leu Ile Ser Tyr Cys Pro Thr
        -5                  1               5

gcc acc gaa gag gca cca tac tgg aca tac ctt tta tgt gca ctg gga 209 

Ala Thr Glu Glu Ala Pro Tyr Trp Thr Tyr Leu Leu Cys Ala Leu Gly 

10 15 20 25 

15 ctt ttt att tac cag tea ctg gat get att gat ggg aaa caa gcc aga 257 

Leu Phe lie Tyr Gin Ser Leu Asp Ala lie Asp Gly Lys Gin Ala Arg 

30 35 40 

aga aca aac tct tgt tec cct tta ggg gag etc ttt gac cat ggc tgt 305 

Arg Thr Asn Ser Cys Ser Pro Leu Gly Glu Leu Phe Asp His Gly Cys 

20 45 50 55 

gac tct ctt tec aca gta ttt atg gca gtg gga get tea att gcc get 353 

Asp Ser Leu Ser Thr Val Phe Met Ala Val Gly Ala Ser lie Ala Ala 

60 65 70 

cgc tta gga act tat cct gac tgg ttt ttt ttc tgc tct ttt att ggg 401 

25 Arg Leu Gly Thr Tyr Pro Asp Trp Phe Phe Phe Cys Ser Phe lie Gly 
75 80 85 

atg ttt gtg ttt tat tgc get cat tgg cag act tat gtt tea ggc atg 449 

Met Phe Val Phe Tyr Cys Ala His Trp Gin Thr Tyr Val Ser Gly Met 

90 95 100 105 

30 ttg aga ttt gga aaa gtg gat gta act gaa att cag ata get tta gtg 497 

Leu Arg Phe Gly Lys Val Asp Val Thr Glu lie Gin lie Ala Leu Val 

110 115 120 

att gtc ttt gtg ttg tct gca ttt gga gga gca aca atg tgg gac tat 545 

lie Val Phe Val Leu Ser Ala Phe Gly Gly Ala Thr Met Trp Asp Tyr 

35 125 130 135 

acg ggc acc agt gtc ttg tea cct gga etc cac ata gga eta att att 593 

Thr Gly Thr Ser Val Leu Ser Pro Gly Leu His lie Gly Leu lie lie 

140 145 150 

ata ctg gca ata atg ate tat aaa aag tea gca act gat gtg ttt gaa 641 

40 lie Leu Ala lie Met lie Tyr Lys Lys Ser Ala Thr Asp Val Phe Glu 
155 160 165 

aag cat cct tgt ctt tat ate eta atg ttt gga tgt gtc ttt get aaa 689 

Lys His Pro Cys Leu Tyr lie Leu Met Phe Gly Cys Val Phe Ala Lys 

170 175 180 185 

45 gtc tea caa aaa tta gtg gta get cac atg acc aaa agt gaa eta tat 737 

Val Ser Gin Lys Leu Val Val Ala His Met Thr Lys Ser Glu Leu Tyr 

190 195 200 

ctt caa gac act gtc ttt ttg ggg cca ggt ctt ttg ttt tta gac cag 785 

Leu Gin Asp Thr Val Phe Leu Gly Pro Gly Leu Leu Phe Leu Asp Gin 

50 205 210 215 

tac ttt aat aac ttt ata gac gaa tat gtt gtt eta tgg atg gca atg 833 

Tyr Phe Asn Asn Phe lie Asp Glu Tyr Val Val Leu Trp Met Ala Met 

220 225 230 

gtg att tct tea ttt gat atg gtg ata tac ttt agt get ttg tgc ctg 881 

55 Val lie Ser Ser Phe Asp Met Val lie Tyr Phe Ser Ala Leu Cys Leu 
235 240 245 

caa att tea aga cac ctt cat eta aat ata ttc aag act gca tgt cat 929 

Gin lie Ser Arg His Leu His Leu Asn lie Phe Lys Thr Ala Cys His 

250 255 260 265 

60 caa gca cct gaa cag gtt caa gtt ctt tct tea aag agt cat cag aat 977 

Gin Ala Pro Glu Gin Val Gin Val Leu Ser Ser Lys Ser His Gin Asn 

270 275 280 

aac atg gat tgaagagact tccgaacact tgctatctct tgctgctgct 1026
Asn Met Asp

gtttcatgga aggagatatt aaacatttgt ttaattttta tttaagtgtt atacctattt 1086 
cagcaaataa aatatttcat tgcttgaaaa aaaaaaaaaa aaa 1129 

<210> 109
<211> 778
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 266..586

<220>
<221> sig_peptide
<222> 266..307
<223> Von Heijne matrix
score 4.534746808071
seq ILVTVPGVCPAQC/CW

<400> 109

tagatgttag aattggtatt tgttcttgct ttttggttgc gatggagtta tatactaagt 60 
tacttatact aaggcattag tagtctcata tctgaggagc aattgtattt ttagttcagc 120 
taaattaatg cctcttttta aatactaact tgtactactt ttgtggctgt gaatggtatc 180 
25 ttttattgaa ctgaggcagc ttttaaaaga cttgcctgat catttagagc actcccattg 240 
aggttaaatt agacttgaat ctgta atg att etc gta act gtt cct ggt gtg 292 

Met lie Leu Val Thr Val Pro Gly Val 
-10 

tgt cca gca caa tgt tgc tgg gca gag cag agg ggc aga ggc tea ggt 340 

30 Cys Pro Ala Gin Cys Cys Trp Ala Glu Gin Arg Gly Arg Gly Ser Gly 
-5 1 5 10 

atg tac ttc att gac aag tgg gca agg cca tec tgg gta cca cat tgg 388 
Met Tyr Phe lie Asp Lys Trp Ala Arg Pro Ser Trp Val Pro His Trp 
15 20 25 

35 ctt aat gat etc ttc att gtg aag tec ggc tac etc gtt tgc ata aga 436 
Leu Asn Asp Leu Phe lie Val Lys Ser Gly Tyr Leu Val Cys lie Arg 

30 35 40 

act aca gta ate agg caa ggc att gtc aga att ggg agg aat aaa ate 484 
Thr Thr Val lie Arg Gin Gly lie Val Arg lie Gly Arg Asn Lys lie 

40 45 50 55 

agt gag tct gga agg agt get ctg tat aca att gca aag aac aaa atg 532 

Ser Glu Ser Gly Arg Ser Ala Leu Tyr Thr lie Ala Lys Asn Lys Met 

60 65 70 75 

gtc ate ttt aag gta cct gat tgc atg cac tta aat gca gat tat ttt 580 

45 Val lie Phe Lys Val Pro Asp Cys Met His Leu Asn Ala Asp Tyr Phe 

80 85 90 

gga gtt tgaaaaggga ctattaatga aatctttctt ttccctcctt tctctttttc 636 
Gly Val 

ccttccccgc cactgattca gtgagctgga gattggatca cagecgaagg agtaaaggtg 696 
50 ctgeaatgat gttagctgtg gccactgtgg atttttcgea agaacattaa taaactaaaa 756 
acttcaaaaa aaaaaaaaaa aa 778 

<210> 110
<211> 1301
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 59..745

<220>
<221> sig_peptide
<222> 59..160
<223> Von Heijne matrix
score 5.94384548075359
seq LGAAALALLLANT/DV

<400> 110

attcaaaacc aggctgaaga ttggaaggaa gttggccagc ctcggctgca ggacagaa 58 
atg tct ttc etc cag gac cca agt ttc ttc acc atg ggg atg tgg tec 106 
Met Ser Phe Leu Gin Asp Pro Ser Phe Phe Thr Met Gly Met Trp Ser 

10 -30 -25 -20 

att ggt gca gga gee ctg ggg get get gee ttg gca ttg ctg ctt gee 154 
lie Gly Ala Gly Ala Leu Gly Ala Ala Ala Leu Ala Leu Leu Leu Ala 

-15 -10 -5 

aac aca gac gtg ttt ctg tec aag ccc cag aaa gcg gee ctg gag tac 202 

15 Asn Thr Asp Val Phe Leu Ser Lys Pro Gin Lys Ala Ala Leu Glu Tyr 
1 5 10 

ctg gag gat ata gac ctg aaa aca ctg gag aag gaa cca agg act ttc 250 
Leu Glu Asp lie Asp Leu Lys Thr Leu Glu Lys Glu Pro Arg Thr Phe 
15 20 25 30 

20 aaa gca aag gag eta tgg gaa aaa aat gga get gtg att atg gee gtg 298 
Lys Ala Lys Glu Leu Trp Glu Lys Asn Gly Ala Val lie Met Ala Val 

35 40 45 

egg agg cca ggc tgt ttc etc tgt cga gag gaa get gcg gat ctg tec 346 
Arg Arg Pro Gly Cys Phe Leu Cys Arg Glu Glu Ala Ala Asp Leu Ser 

25 50 55 60 

tec ctg aaa age atg ttg gac cag ctg ggc gtc ccc etc tat gca gtg 394 
Ser Leu Lys Ser Met Leu Asp Gin Leu Gly Val Pro Leu Tyr Ala Val 

65 70 75 

gta aag gag cac ate agg act gaa gtg aag gat ttc cag cct tat ttc 442 

30 Val Lys Glu His lie Arg Thr Glu Val Lys Asp Phe Gin Pro Tyr Phe 
80 85 90 

aaa gga gaa ate ttc ctg gat gaa aag aaa aag ttc tat ggt cca caa 490 
Lys Gly Glu lie Phe Leu Asp Glu Lys Lys Lys Phe Tyr Gly Pro Gin 
95 100 105 110 

35 agg egg aag atg atg ttt atg gga ttt ate cgt ctg gga gtg tgg tac 538 
Arg Arg Lys Met Met Phe Met Gly Phe lie Arg Leu Gly Val Trp Tyr 

115 120 125 

aac ttc ttc cga gee tgg aac gga ggc ttc tct gga aac ctg gaa gga 586 
Asn Phe Phe Arg Ala Trp Asn Gly Gly Phe Ser Gly Asn Leu Glu Gly 

40 130 135 140 

gaa ggc ttc ate ctt ggg gga gtt ttc gtg gtg gga tea gga aag cag 634 
Glu Gly Phe lie Leu Gly Gly Val Phe Val Val Gly Ser Gly Lys Gin 

145 150 155 

ggc att ctt ctt gag cac cga gaa aaa gaa ttt gga gac aaa gta aac 682 

45 Gly lie Leu Leu Glu His Arg Glu Lys Glu Phe Gly Asp Lys Val Asn 
160 165 170 

eta ctt tct gtt ctg gaa get get aag atg ate aaa cca cag act ttg 730 
Leu Leu Ser Val Leu Glu Ala Ala Lys Met lie Lys Pro Gin Thr Leu 
175 180 185 190 

50 gee tea gag aaa aaa tgattgtgtg aaactgccca gctcagggat aaccagggac 785 
Ala Ser Glu Lys Lys 
195 

attcacctgt gttcatggga tgtattgttt ccactcgtgt ccctaaggag tgagaaaccc 845 
atttatactc tactctcagt atggattatt aatgtatttt aatattctgt ttaggcccac 905 

55 taaggcaaaa tagccccaaa acaagactga caaaaatctg aaaaactaat gaggattatt 965 
aagctaaaac ctgggaaata ggaggtttaa aattgactgc caggctgggt gcagtggctc 1025 
acacctgtaa tcccagcact ttgggaggcc aaggtgagca agtcacttga ggtcgggagt 1085 
tcgagaccag cctgagcaac atggcgaaac cccgtctcta ctaaaaatac aaaaatcacc 1145 
cgggtgtggt ggcaggcacc tgtagtccca gctacccggg aggctgaggc aggagaatca 1205 

60 cttgaacctg ggaggtggag gttgcggtga gctgagatca caccactgta ttccagcctg 1265 
ggtgactgag actctaacta aaaaaaaaaa aaaaaa 1301 

<210> 111
<211> 1300
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 59..676

<220>
<221> sig_peptide
<222> 59..160
<223> Von Heijne matrix
score 5.94384548075359
seq LGAAALALLLANT/DV

<400> 111

attcaaaacc aggctgaaga ttggaaggaa gttggccagc ctcggctgca ggacagaa 58 
atg tct ttc etc cag gac cca agt ttc ttc acc atg ggg atg tgg tec 106 
Met Ser Phe Leu Gin Asp Pro Ser Phe Phe Thr Met Gly Met Trp Ser 

20 -30 -25 -20 

att ggt gca gga gec ctg ggg get get gee ttg gca ttg ctg ctt gee 154 
lie Gly Ala Gly Ala Leu Gly Ala Ala Ala Leu Ala Leu Leu Leu Ala 

-15 -10 -5 

aac aca gac gtg ttt ctg tcc aag ccc cag aaa gcg gcc ctg gag tac 202
Asn Thr Asp Val Phe Leu Ser Lys Pro Gln Lys Ala Ala Leu Glu Tyr
        1               5                   10
ctg gag gat ata gac ctg aaa aca ctg gag aag gaa cca agg act ttc 250 
Leu Glu Asp He Asp Leu Lys Thr Leu Glu Lys Glu Pro Arg Thr Phe 
15 20 25 30 

30 aaa gca aag gag eta tgg gaa aaa aat gga get gtg att atg gee gtg 298 
Lys Ala Lys Glu Leu Trp Glu Lys Asn Gly Ala Val He Met Ala Val 

35 40 45 

egg agg cca ggc tgt ttc etc tgt cga gag gaa get gcg gat ctg tec 346 
Arg Arg Pro Gly Cys Phe Leu Cys Arg Glu Glu Ala Ala Asp Leu Ser 

            50                  55                  60

tec ctg aaa age atg ttg gac cag ctg ggc gtc ccc etc tat gca gtg 394 
Ser Leu Lys Ser Met Leu Asp Gin Leu Gly Val Pro Leu Tyr Ala Val 

65 70 75 

gta aag gag cac ate agg act gaa gtg aag gat ttc cag cct tat ttc 442 

40 Val Lys Glu His He Arg Thr Glu Val Lys Asp Phe Gin Pro Tyr Phe 
80 85 90 

aaa gga gaa ate ttc ctg gat gaa aag aaa aag ttc tat ggt cca caa 490 
Lys Gly Glu He Phe Leu Asp Glu Lys Lys Lys Phe Tyr Gly Pro Gin 
95 100 105 110 

45 agg egg aag atg atg ttt atg gga ttt ate cgt ctg gga gtg tgg tac 538 
Arg Arg Lys Met Met Phe Met Gly Phe He Arg Leu Gly Val Trp Tyr 

115 120 125 

aac ttc ttc cga gee tgg aac gga ggc ttc tct gga aac ctg gaa gga 586 
Asn Phe Phe Arg Ala Trp Asn Gly Gly Phe Ser Gly Asn Leu Glu Gly 

50 130 135 140 

gaa ggc ttc ate ctt ggg gga gtt ttc gtg gtg gga tea gga age agg 634 
Glu Gly Phe He Leu Gly Gly Val Phe Val Val Gly Ser Gly Ser Arg 

145 150 155 

gca ttc ttc ttg age acc gag aaa aag aat ttg gag aca aag 676 

55 Ala Phe Phe Leu Ser Thr Glu Lys Lys Asn Leu Glu Thr Lys 
160 165 170 

taaacctact ttctgttctg gaagctgeta agatgatcaa accacagact ttggcctcag 736 
agaaaaaatg attgtgtgaa actgcccagc tcagggataa ccagggacat tcacctgtgt 796 
tcatgggatg tattgtttcc actcgtgtcc ctaaggagtg agaaacccat ttatactcta 856 

60 ctctcagtat ggattattaa tgtattttaa tattctgttt aggcccacta aggcaaaata 916 
gccccaaaac aagactgaca aaaatctgaa aaactaatga ggattattaa gctaaaacct 976 
gggaaatagg aggtttaaaa ttgactgeca ggctgggtgc agtggctcac acctgtaatc 1036 
ccagcacttt gggaggccaa ggtgagcaag tcacttgagg tcgggagttc gagaccagcc 1096
tgagcaacat ggcgaaaccc cgtctctact aaaaatacaa aaatcacccg ggtgtggtgg 1156

caggcacctg tagtcccagc tacccgggag gctgaggcag gagaatcact tgaacctggg 1216 

aggtggaggt tgcggtgagc tgagatcaca ccactgtatt ccagcctggg tgactgagac 1276 

tctaactaaa aaaaaaaaaa aaaa 1300 

<210> 112
<211> 1617
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 15..278

<220>
<221> sig_peptide
<222> 15..146
<223> Von Heijne matrix
score 12.2610572403264
seq PLFLLLLLGSVTA/DI

<400> 112

gagaggagag gaga atg gcg gcg gaa ggc tgg att tgg cgt tgg ggc tgg 50
               Met Ala Ala Glu Gly Trp Ile Trp Arg Trp Gly Trp
                               -40                 -35

ggc cgg cgg tgc ctg gga agg cct ggg ctt ctc ggc ccc ggc cct ggc 98
Gly Arg Arg Cys Leu Gly Arg Pro Gly Leu Leu Gly Pro Gly Pro Gly
        -30                 -25                 -20

ccc act aca cct ctc ttt ctt ctt ttg ttg ttg ggg tct gtg act gcg 146
Pro Thr Thr Pro Leu Phe Leu Leu Leu Leu Leu Gly Ser Val Thr Ala
    -15                 -10                 -5

gat ata act gac ggc aac att gaa cat ctc aag cgg gag cat tcg ctc 194
Asp Ile Thr Asp Gly Asn Ile Glu His Leu Lys Arg Glu His Ser Leu
1               5                   10                  15

att aag ccc tac caa ggg gtc ggt tcc agc tcc ccc tct ggg act tcc 242
Ile Lys Pro Tyr Gln Gly Val Gly Ser Ser Ser Pro Ser Gly Thr Ser
            20                  25                  30

agg gca gca cta tgc tca cga gcc agt acg tac gtc tgacccctga 288
Arg Ala Ala Leu Cys Ser Arg Ala Ser Thr Tyr Val
        35                  40

cgagcgcagc aaagagggct ctatctggaa ccaccagccg tgcttcctca aagactggga 348
aatgcacgtc cacttcaaag tccacggcac agggaagaag aacctccatg gagacggcat 408
cgccttgtgg tacacccggg accgcctcgt gccagggcct gtgtttggaa gcaaagataa 468
cttccacggc ttagccatct tcctggacac ctaccccaat gatgagacca ctgagcgcgt 528
gttcccgtac atctcggtga tggtgaacaa tggctccctg tcctacgacc acagcaagga 588
tgggcgctgg accgagctgg cgggctgcac ggctgacttc cgcaaccgcg atcacgacac 648
cttcctggct gtgcgctact cccggggccg tctgacggtg atgaccgacc tggaggacaa 708
gaacgagtgg aagaactgca ttgacatcac gggagtgcgc ctgcccaccg gctactactt 768
cggggcctcc gccggcaccg gcgacctgtc tgacaatcat gacatcatct ccatgaagct 828
gttccagctg atggtggagc acacgcccga cgaggagagc atcgactgga ccaagatcga 888
gcccagcgtc aacttcctca agtcgcccaa agacaacgtg gacgacccca cggggaactt 948
ccgcagcggg cccctgacgg ggtggcgggt gttcctgctg ctgctgtgcg ctctcctggg 1008
catcgttgtc tgcgccgtgg tgggggccgt ggtgttccag aagcggcagg agcggaacaa 1068
gcgcttctac tgagtggcgc ctccggcggg gcctgtccct gggcccagga gccaatgtga 1128
actttttttt taccgggatt ataaaagaac aacaagatga ccttatttct taactgtttc 1188
aaataaatga ttaaagtatt ttcatacatt ttgcttcttg cccagcaggg acaggtggca 1248
gagccgaggc ttagggtctg gcacccccca cagctggaga cggaggctct cctggggctg 1308
gtgtctcagg agcaggggtc tgtgtctaca gatgggctgt ggcccctgca ggcagctgtt 1368
gaacactgga gggtcccccg gaccacactg gggtgggctc ctgaggacgt ggggaagtga 1428
ttttgttttg tggtgtgtgg cacgtgtggc gacggataag gcctgaactg gaaacccagg 1488
ccttcctgtt caccctgagc tgcttcctga gacagatgct caagtgaggc tgcaggcgcg 1548
gtgtggtggg gccgagtgtg accgtttgct aaataaagtg aaatacccaa caaaaaaaaa 1608
aaaaaaaaa 1617

<210> 113
<211> 1634
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 167..619

<220>
<221> sig_peptide
<222> 167..262
<223> Von Heijne matrix
score 6.8501239662158
seq LLSSCGLPPSTAS/AV

<400> 113

gtacttccga gagagattaa agattcaatg gaactctgcg tctctcatct ggaacccagg 60 
20 acacagaaca agggagggaa gaaaagctca gccttaaaca tagcaaggtg aaacctttgt 120 
cctggggaat agtctggccc gctccttgga accacactca gactca atg gac tct 175 

Met Asp Ser 
-30 

gcc tea aat ccc acc aac ctt gtc age acc tec caa agg cac egg ccc 223 
25 Ala Ser Asn Pro Thr Asn Leu Val Ser Thr Ser Gin Arg His Arg Pro 

-25 -20 -15 

ttg ctt tea tec tgt ggc etc cca cca age act gcc tea get gtg cgc 271 
Leu Leu Ser Ser Cys Gly Leu Pro Pro Ser Thr Ala Ser Ala Val Arg 
-10 -5 1 

30 agg eta tgc tec agg gga gtg tta aaa gga tea aat gaa aga agg gat 319 
Arg Leu Cys Ser Arg Gly Val Leu Lys Gly Ser Asn Glu Arg Arg Asp 

5 10 15 

atg gaa tea ttt tgg aaa eta aat cgt tec cca ggg teg gac cga tac 367 
Met Glu Ser Phe Trp Lys Leu Asn Arg Ser Pro Gly Ser Asp Arg Tyr 
20                  25                  30                  35

ctg gag age cgc gat gcc tct cga ctg agt ggc egg gac ccc tec tea 415 
Leu Glu Ser Arg Asp Ala Ser Arg Leu Ser Gly Arg Asp Pro Ser Ser 

40 45 50 

tgg aca gtc gag gat gtg atg cag ttt gtc egg gaa get gat cct cag 463 
40 Trp Thr Val Glu Asp Val Met Gin Phe Val Arg Glu Ala Asp Pro Gin 
55 60 65 

ctt gga ccc cac get gac ctg ttt cgc aaa cac gag ate gat ggc aag 511 
Leu Gly Pro His Ala Asp Leu Phe Arg Lys His Glu lie Asp Gly Lys 
70 75 80 

45 gcc ctg ctg ctg ctg cgc agt gac atg atg atg aag tac atg ggc ctg 559 
Ala Leu Leu Leu Leu Arg Ser Asp Met Met Met Lys Tyr Met Gly Leu 

85 90 95 

aag ctg ggg cct gca etc aag etc tec tac cac att gac egg ctg aag 607 
Lys Leu Gly Pro Ala Leu Lys Leu Ser Tyr His lie Asp Arg Leu Lys 
50 100 105 110 115 

cag ggc aag ttc tgaaccagga gaggcagect agacaaccaa gtggcagcag 659 
Gin Gly Lys Phe 

gtgggggcat tcttctagga atgaggggca tcagcccacc ccaggcacct cagtggggtt 719 
ccgggccacc tcaggactcc aagaggctgt gtggagccac cactcctagc cacagctgcc 779 

55 atgataagtc cttccatgaa ggactgagga gggagagtgg gggtccaggg ctggtgctgc 839 
tcttccctca gctctgccgg ggctctaagg tccctctatt tatttctcaa ccctggctgg 899 
cctctcacca ggagtttagg ctgaatgect tecaegtgat ggaggaaaag gccaactctg 959 
tcctggtctt gctgtggcac cccatcgccc cacagctcgt accttctcac cagattcccc 1019 
tgaatccaaa ctsgtggtgc aaacctctac cttttttaca aaaagatctt attgttaatt 1079 

60 tattgtttct ggcacttggg caaaccctgt agttaatact cctccccmac actagacact 1139 
gggtttcagg aggagggaga ctgccctgct ttggtcccca gagaggcect ctgeagatag 1199 
gcgtggcccc tcttcagagg acactaccct agggcacttt ctctttgagg tggagagacc 1259 
cataaagcct tgaccacatc actccatatg gggaggagaa ggatccctgt caccttctcc 1319
tctcttcacg gggccctttt gcagccctag gcctcatctg tgggaaggga gtccctggct 1379
tatactgccc ccaccacagc tccttgccct ggccagaact gctgtcgaag aaaatcaggc 1439
cggaaggcca agaaggcgct aagggggatg ggagggcagg ttttccaggc tggagtcggt 1499
tccacccact cgcctgtcca caggcttcct tgtaagcaag tcagcagcac agctactcac 1559
gctgccatct ggacttattt tatgtcaatc tgtttataaa taaaaaccaa tataggtaaa 1619
aaaaaaaaaa aaaaa 1634

<210> 114
<211> 693
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 223..417

<220>
<221> sig_peptide
<222> 223..270
<223> Von Heijne matrix
score 4.19788230215007
seq LACVRESTSVAWA/CK

<400> 114

25 ttagggggcc tgtcacccag cacgtgcatc gggggctgtc ccgggggtca ggggagggag 60 
gccagcgggc mgtgtcgggg tccgccccga ccccatccac gaccccgact cctatccgat 120 
cctatccccg gccccgctcg ggcctttccc cttgcgccct ggctcggctg gctcgacgag 180 
cagtaagttc gtagccgccc tccgaagccg ggcgtgcatg gg atg gca gag ttg 234 

Met Ala Glu Leu 

30 -15 

gcg tgc gtg cgt gag tec acc agt gtg gca tgg gca tgt aag gtg cgc 282 
Ala Cys Val Arg Glu Ser Thr Ser Val Ala Trp Ala Cys Lys Val Arg 

-10 -5 1 

gga ggg act gca cct tct cca tea ggt gca gaa ggc cac gtc atg ctg 330 

35 Gly Gly Thr Ala Pro Ser Pro Ser Gly Ala Glu Gly His Val Met Leu 
5 10 15 20 

aac aag age cga gaa gta gaa teg cca gtg tea age cgt cca cgt tgt 378 
Asn Lys Ser Arg Glu Val Glu Ser Pro Val Ser Ser Arg Pro Arg Cys 
25 30 35 

40 ggg atg ccc act gtt ccc cca gga tea etc aag acc ctg tgacttgtgg 427 
Gly Met Pro Thr Val Pro Pro Gly Ser Leu Lys Thr Leu 

40 45 
tcactgatga gtggaccaag tgaagtccac aagatggctg ctgtggctcc aggcatcacg 487 
tccacatgca aatccatcca gaggcaggaa ctgggaatag gcttggaggt ggecaggaca 547 

45 gcaagtgggc tgtctgtata aacctcccct ccacttggga aggaaaatca ccccccaagt 607 
cgattttctg tccatcttat tgatcagaga gcgttataaa ttcacccatt aaataatctg 667 
gacaagggga aaaaaaaaaa aaaaaa 693 

<210> 115
<211> 784
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 166..732

<220>
<221> sig_peptide
<222> 166..237
<223> Von Heijne matrix
score 6.60662787180923
seq KMVHLLVLSGAWG/MQ

<400> 115

attattggtt gggggaaacc cacgagggga cgcggccgag gagggtcgct gtccacccgg 60 
gggcgtggga gtgaggtacc agattcagcc catttggccc cgacgcctct gttctcggaa 120 
5 tccgggtgct gcggattgag gtcccggttc ctaacggact gcaag atg gag gaa ggc 177 

Met Glu Glu Gly 

ggg aac eta gga ggc ctg att aag atg gtc cat eta ctg gtc ttg tea 225 
Gly Asn Leu Gly Gly Leu lie Lys Met Val His Leu Leu Val Leu Ser 
-20 -15 -10 -5 

10 ggt gec tgg ggc atg caa atg tgg gtg acc ttc gtc tea ggc ttc ctg 273 
Gly Ala Trp Gly Met Gin Met Trp Val Thr Phe Val Ser Gly Phe Leu 

1 5 10 

ctt ttc cga age ctt ccc cga cat acc ttc gga eta gtg cag age aaa 321 
Leu Phe Arg Ser Leu Pro Arg His Thr Phe Gly Leu Val Gin Ser Lys 

15 15 20 25 

etc ttc ccc ttc tac ttc cac ate tec atg ggc tgt gsc ttc ate aac 369 
Leu Phe Pro Phe Tyr Phe His lie Ser Met Gly Cys Xaa Phe lie Asn 

30 35 40 

etc tgc ate ttg get tea cag cat get tgg get cag etc aca ttc tgg 417 

20 Leu Cys lie Leu Ala Ser Gin His Ala Trp Ala Gin Leu Thr Phe Trp 
45 50 55 60 

gag gee age cag ctt tac ctg ctg ttc ctg age ctt acg ctg gec act 465 
Glu Ala Ser Gin Leu Tyr Leu Leu Phe Leu Ser Leu Thr Leu Ala Thr 
65 70 75 

25 gtc aac gee cgc tgg ctg gaa ccc cgc acc aca get gee atg tgg gec 513 
Val Asn Ala Arg Trp Leu Glu Pro Arg Thr Thr Ala Ala Met Trp Ala 

80 85 90 

ctg caa acc gtg gag aag gag cga ggc ctg ggt ggg gag gta cca ggc 561 
Leu Gin Thr Val Glu Lys Glu Arg Gly Leu Gly Gly Glu Val Pro Gly 

30 95 100 105 

age cac cag ggt ccc gat ccc tac cgc cag ctg cga gag aag gac ccc 
Ser His Gin Gly Pro Asp Pro Tyr Arg Gin Leu Arg Glu Lys Asp Pro 

110 115 120 

aag tac agt get etc cgc cag aat ttc ttc cgc tac cat ggg ctg tec 657 

35 Lys Tyr Ser Ala Leu Arg Gin Asn Phe Phe Arg Tyr His Gly Leu Ser 
125 130 135 140 

tct ctt tgc aat ctg ggc tgc gtc ctg age aat ggg etc tgt etc get 705 
Ser Leu Cys Asn Leu Gly Cys Val Leu Ser Asn Gly Leu Cys Leu Ala 
145 150 155 

40 ggc ctt gee ctg gaa ata agg age etc tagcatgggc cctgcatgct 752 
Gly Leu Ala Leu Glu lie Arg Ser Leu 
160 165 
aataaatget tctccaaaaa aaaaaaaaaa aa 784 

45 <210> 116 
<211> 804 
<212> DNA 

<213> Homo sapiens 

50 <220> 

<221> CDS 
<222> 75 . . 623 

<220> 

<221> sig_peptide
<222> 75..215
<223> Von Heijne matrix 

score 8.34104221735598 

seq RLLLPCLVRMALC/AP 






<400> 116 

agtacggtgg ccgaegggag teagaegctg gggatgaatg aaggtgctgg gtgeaggate 60 
aacaaacagt aata atg act gaa tgt aca agt ctt cag ttt gtc age cct 110 
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Met Thr Glu Cys Thr Ser Leu Gin Phe Val Ser Pro 

-45 -40 

ttt get ttt gag gca atg cag aag gtg gat gtt gtt tgc ctg gca tct 158 

Phe Ala Phe Glu Ala Met Gin Lys Val Asp Val Val Cys Leu Ala Ser 

5 -35 -30 -25 -20 

tta agt gat cca gaa tta aga ctt ctt ctg ccc tgt ttg gta egg atg 206 

Leu Ser Asp Pro Glu Leu Arg Leu Leu Leu Pro Cys Leu Val Arg Met 

-15 -10 -5 

gca ctt tgt gca cct get gac cag age caa age tgg get cag gat aag 254 

10 Ala Leu Cys Ala Pro Ala Asp Gin Ser Gin Ser Trp Ala Gin Asp Lys 
1 5 10

aaa etc ate ctt cgc ctt ctt tct gga gtg gaa get gtc aac tec att 302 

Lys Leu lie Leu Arg Leu Leu Ser Gly Val Glu Ala Val Asn Ser He 

15 20 25 

15 gtt gca ttg ttg tec gtg gac ttt cat get tta gaa caa gat gec age 350 

Val Ala Leu Leu Ser Val Asp Phe His Ala Leu Glu Gin Asp Ala Ser 

30 35 40 45 

aaa gaa cag cag ctt aga ccg agt ctt gee ctg ttg ccc agg ctg gag 398 

Lys Glu Gin Gin Leu Arg Pro Ser Leu Ala Leu Leu Pro Arg Leu Glu 
20 50 55 60 

tgc ggt ggc gtg ate teg get cac tgc aac etc cac etc ctg ggt tea 446 

Cys Gly Gly Val He Ser Ala His Cys Asn Leu His Leu Leu Gly Ser 

65 70 75 

agt gat tct tct gec tea gtc tec cga gta gat ggg act aca ggc acg 494 

25 Ser Asp Ser Ser Ala Ser Val Ser Arg Val Asp Gly Thr Thr Gly Thr 
80 85 90 

cgc cac cat gec egg ctt ttt tgt att att agt aga gac gag gtt tea 542 

Arg His His Ala Arg Leu Phe Cys He He Ser Arg Asp Glu Val Ser 

95 100 105 

30 cca tat tgg cca ggc tgg tct cga act ccc aac ctt gtg ate cac ctg 590 

Pro Tyr Trp Pro Gly Trp Ser Arg Thr Pro Asn Leu Val He His Leu 

110 115 120 125 

cct cag cct ccc aaa gta ctg gga tta ccg gcg tgagccactg tgcctggcct 643 
Pro Gin Pro Pro Lys Val Leu Gly Leu Pro Ala 
35 130 135 

atgtggtgga gtatttatta tacgtaggat gtgaatccct gaaatacaca ggcaaactaa 703 

atagcatttc agaagtaaca gaacatttta gaacacttta tacatccttt tatagcttat 763 

ttcaataaaa gataattttt atacaaaaaa aaaaaaaaaa a 804 

40 <210> 117 

<211> 484 

<212> DNA 

<213> Homo sapiens 

45 <220> 

<221> CDS 
<222> 30 . .335 

<220> 

<221> sig_peptide
<222> 30..71
<223> Von Heijne matrix 

score 4.49063834776683 

seq FLTALLWRGRIPG/RQ 

55 

<400> 117 

gcagagtctt gagcagegeg gcaggcacc atg ttc ctg act gcg etc etc tgg 53 

Met Phe Leu Thr Ala Leu Leu Trp 
-10 

60 cgc ggc cgc att ccc ggc cgt cag tgg ate ggg aag cac egg egg ccg 101 
Arg Gly Arg He Pro Gly Arg Gin Trp He Gly Lys His Arg Arg Pro 

-5 1 5 10 

cgg ttc gtg tcg ttg cgc gcc aag cag aac atg atc cgc cgc ctg gag 149
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Arg Phe Val Ser Leu Arg Ala Lys Gin Asn Met lie Arg Arg Leu Glu 

15 20 25 

ate gag gcg gag aac cat tac tgg ctg age atg ccc tac atg ace egg 197 
lie Glu Ala Glu Asn His Tyr Trp Leu Ser Met Pro Tyr Met Thr Arg 
5 30 35 40 

gag cag gag cgc ggc cac gee gcg gtg cgc agg agg gag gec ttc gag 245 
Glu Gin Glu Arg Gly His Ala Ala Val Arg Arg Arg Glu Ala Phe Glu 

45 50 55 

gee ata aag gcg gec gec act tec aag ttc ccc ccg cat aga ttc att 293 
10 Ala lie Lys Ala Ala Ala Thr Ser Lys Phe Pro Pro His Arg Phe lie 
60 65 70 

gcg gac cag etc gac cat etc aat gtc acc aag aaa tgg tec 335 
Ala Asp Gin Leu Asp His Leu Asn Val Thr Lys Lys Trp Ser 
75 80 85 

15 taatcctgag tcgtcaccct tggattttat ggatcacgga gctgaccatc tttacctggt 395 
cctggaactg aaaaactgta gcttgtgtga aaatgagect ttggaccagt ctttattaaa 455 
acaaacaaac acaaaaaaaa aaaaaaaaa 484

<210> 118 
20 <211> 985 
<212> DNA 

<213> Homo sapiens 

<220> 
25 <221> CDS 

<222> 21 . . 752 

<220> 

<221> sig_peptide 
30 <222> 21 . . 107 

<223> Von Heijne matrix 

score 3.61056351168286 

seq FPLYLLNFLGLWS/WI 

35 <400> 118 

gtttttttcc cttctgagca atg gag ctt acc ate ttt ate ctg aga ctg gec 53 

Met Glu Leu Thr lie Phe lie Leu Arg Leu Ala 
-25 -20 
att tac ate ctg aca ttt ccc ttg tac ctg ctg aac ttt ctg ggc ttg 101 
40 lie Tyr lie Leu Thr Phe Pro Leu Tyr Leu Leu Asn Phe Leu Gly Leu 
-15 -10 -5 

tgg agc tgg ata tgc aaa aaa tgg ttc ccc tac ttc ttg gtg agg ttc 149

Trp Ser Trp lie Cys Lys Lys Trp Phe Pro Tyr Phe Leu Val Arg Phe 
1 5 10 

45 act gtg ata tac aac gaa cag atg gca age aag aag egg gag etc ttc 197 
Thr Val lie Tyr Asn Glu Gin Met Ala Ser Lys Lys Arg Glu Leu Phe 
15 20 25 30 

agt aac ctg cag gag ttt gcg ggc ccc tec ggg aaa etc tec ctg ctg 245 
Ser Asn Leu Gin Glu Phe Ala Gly Pro Ser Gly Lys Leu Ser Leu Leu 
50 35 40 45 

gaa gtg ggc tgt ggc acg ggg gec aac ttc aag ttc tac cca cct ggg 293 
Glu Val Gly Cys Gly Thr Gly Ala Asn Phe Lys Phe Tyr Pro Pro Gly 

50 55 60 

tgc agg gtg acc tgt att gac ccc aac ccc aac ttt gag aag ttt ttg 341 
55 Cys Arg Val Thr Cys lie Asp Pro Asn Pro Asn Phe Glu Lys Phe Leu 
65 70 75 

ate aag age att gca gag aac cga cac ctg cag ttt gag cgc ttt gtg 389 
lie Lys Ser He Ala Glu Asn Arg His Leu Gin Phe Glu Arg Phe Val 
80 85 90 

60 gta get gec ggg gag aac atg cac cag gtg get gat ggc tct gtg gat 437 
Val Ala Ala Gly Glu Asn Met His Gin Val Ala Asp Gly Ser Val Asp 
95 100 105 110

gtg gtg gtc tgc acc ctg gtg ctg tgc tct gtg aag aac cag gag cgg 485
Val Val Val Cys Thr Leu Val Leu Cys Ser Val Lys Asn Gln Glu Arg
115 120 125

att ctc cgc gag gtg tgc aga gtg ctg aga ccg gga ggg gct ttc [illegible] 533
Ile Leu Arg Glu Val Cys Arg Val Leu Arg Pro Gly Gly Ala Phe Tyr
130 135 140

ttc atg gag cat gtg gca gct gag tgt tcg act tgg aat tac ttc tgg 581
Phe Met Glu His Val Ala Ala Glu Cys Ser Thr Trp Asn Tyr Phe Trp
145 150 155

caa caa gtc ctg gat cct gcc tgg cac ctt ctg ttt gat ggg tgc aac 629
Gln Gln Val Leu Asp Pro Ala Trp His Leu Leu Phe Asp Gly Cys Asn
160 165 170

ctg acc aga gag agc tgg aag gcc ctg gag cgg gcc agc ttc tct aag 677
Leu Thr Arg Glu Ser Trp Lys Ala Leu Glu Arg Ala Ser Phe Ser Lys
175 180 185 190

ctg aag ctg cag cac atc cag gcc cca ctg tcc tgg gag ttg gtg cgc 725
Leu Lys Leu Gln His Ile Gln Ala Pro Leu Ser Trp Glu Leu Val Arg
195 200 205

cct cat atc tat gga tat gct gtg aaa tagtgtgagc tggcagttaa 772
Pro His Ile Tyr Gly Tyr Ala Val Lys
210 215

gagctgaatg gctcaaagaa tttaaagctt cagttttaca tttaaaatgc taggtgggtg 832
cctgtaatcc caggtacttg gaaggctgag gcaggagaat ctcttgaacc cagaaggcga 892
aggttgcagt gaaccgagat catgccattg tactctagcc tgggtgacaa gagcaagact 952
ccgtctcaaa aaaaaataaa aaaaaaaaaa aaa 985

<210> 119 
<211> 839 
<212> DNA 
<213> Homo sapiens 

<220> 
<221> CDS 
<222> 185 . . 715 



<220>

<221> sig_peptide

<222> 185 . .253 

<223> Von Heijne matrix 

score 9.49395175807817 
40 seq SLLFICFFGESFC/IC 



<400> 119 

atattttget gactggcaag gttatatgaa gtgettttat tgaagcacca ttttaactaa 60 

cagctcctgg tattttctgc ttcccttcgt agggaattta gttattttat tttattattt 120 

45 agctaattta gctattttaa aatagctaaa ttttagctac ttttttttca attgacaaag 180 

aagg atg tct aat caa aga eta ccg ctg att ttt tct ctg ttg ttt ate 229 

Met Ser Asn Gin Arg Leu Pro Leu He Phe Ser Leu Leu Phe He 
-20 -15 -10 

tgc ttc ttc ggg gag agt ttc tgc att tgt gat gga act gtc tgg aca 277
Cys Phe Phe Gly Glu Ser Phe Cys Ile Cys Asp Gly Thr Val Trp Thr
-5 1 5

aag gtt gga tgg gag att ctt cca gaa gaa gta cat tat tgg aaa gtt 325
Lys Val Gly Trp Glu Ile Leu Pro Glu Glu Val His Tyr Trp Lys Val
10 15 20

aag ggt tct cca tct cac tgc ctg cct tat ctt ctg gat aaa cta tgc 373
Lys Gly Ser Pro Ser His Cys Leu Pro Tyr Leu Leu Asp Lys Leu Cys
25 30 35 40

tgc gac ttt gct aac atg gat ata ttt cag ggt tgt tta tat ctc att 421
Cys Asp Phe Ala Asn Met Asp Ile Phe Gln Gly Cys Leu Tyr Leu Ile
45 50 55

tat aat tta tta caa gct gtc ttc ttc gtc tta ttt gtt ttg tct gtg 469
Tyr Asn Leu Leu Gln Ala Val Phe Phe Val Leu Phe Val Leu Ser Val
60 65 70

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cat tac ctg tgg aag aaa tgg aag aaa cac caa aaa aag ctg aaa aag 517 

His Tyr Leu Trp Lys Lys Trp Lys Lys His Gin Lys Lys Leu Lys Lys 

75 80 85 

caa gcc tec tta gaa aaa cct ggt aat gat eta gaa age cca ttg ate 565 

5 Gin Ala Ser Leu Glu Lys Pro Gly Asn Asp Leu Glu Ser Pro Leu lie 

90 95 100 

aac aac att gac caa aca etc cac aga gtg gca ace aca gca tea gtg 613 

Asn Asn lie Asp Gin Thr Leu His Arg Val Ala Thr Thr Ala Ser Val 

105 110 115 120 

10 ata tac aag ate tgg gag cac agg tct cac cat cct tec tct aag aaa 661 

lie Tyr Lys lie Trp Glu His Arg Ser His His Pro Ser Ser Lys Lys 

125 130 135 

att aag cac tgc aaa tta aag aag aag agt aaa gaa gaa gga gcc aga 709 
lie Lys His Cys Lys Leu Lys Lys Lys Ser Lys Glu Glu Gly Ala Arg 

15 140 145 150 

aga tac taaataaatg catatgeaaa tgtagcttac tcaattatag atatcacaaa 765 
Arg Tyr 

agaaatctat catctaagga ttaaaaattg ttctttggaa acctttataa aaaaaaaaga 825 

aaaaaaaaaa aaaa 839 



20 



25 



<210> 120 

<211> 583 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
<222> 54 . .527 



<220>

<221> sig_peptide

<222> 54 . .116 

<223> Von Heijne matrix 

score 6.80928714315144 
35 seq ALXSLNLAPPTVA/AP 



<400> 120 

aaegtcatet aggagcaccg agcagcttgg ctaaaagtaa gggtgtcgtg ctg atg 56 

Met 

gcc ctg tgc gca ctg acc cgc gct ctg ccs tct ctg aac ctg gcg ccc 104
Ala Leu Cys Ala Leu Thr Arg Ala Leu Pro Ser Leu Asn Leu Ala Pro
-20 -15 -10 -5

ccg acc gtc gcc gcc cct gcc ccg agt ctg ttc ccc gcc gcc cag atg 152
Pro Thr Val Ala Ala Pro Ala Pro Ser Leu Phe Pro Ala Ala Gln Met
1 5 10

atg aac aat ggc ctc ctc caa cag ccc tct gcc ttg atg ttg ctc ccc 200
Met Asn Asn Gly Leu Leu Gln Gln Pro Ser Ala Leu Met Leu Leu Pro
15 20 25

tgc cgc cca gtt ctt act tct gtg gcc ctt aat gcc aac ttt gtg tcc 248
Cys Arg Pro Val Leu Thr Ser Val Ala Leu Asn Ala Asn Phe Val Ser
30 35 40

tgg aag agt cgt acc aag tac acc att aca cca gtg aag atg agg aag 296
Trp Lys Ser Arg Thr Lys Tyr Thr Ile Thr Pro Val Lys Met Arg Lys
45 50 55 60

tct ggg ggc cga gac cac aca ggt gct gga aac gtg cgt aga aca gta 344
Ser Gly Gly Arg Asp His Thr Gly Ala Gly Asn Val Arg Arg Thr Val
65 70 75

ggc cga gta tcc aac gtt gat cat aac aaa cgg gtc att ggc aag gca 392
Gly Arg Val Ser Asn Val Asp His Asn Lys Arg Val Ile Gly Lys Ala
80 85 90

ggt cgc aac cgc tgg ctg ggc aag agg cct aac agt ggg cgg tgg cac 440
Gly Arg Asn Arg Trp Leu Gly Lys Arg Pro Asn Ser Gly Arg Trp His
95 100 105

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cgc aag ggg ggc tgg get ggc cga aag att egg cca eta ccc ccc atg 4 88 

Arg Lys Gly Gly Trp Ala Gly Arg Lys lie Arg Pro Leu Pro Pro Met 

110 115 120 

aag agt tac gtg aag ctg cct tct get tct gee caa age tgatatccct 537 

5 Lys Ser Tyr Val Lys Leu Pro Ser Ala Ser Ala Gin Ser 

125 130 135 

gtactctaat aaaatgcccc ccccccctca aaaaaaaaaa aaaaaa 583 

<210> 121 
10 <211> 1024 
<212> DNA 
<213> Homo sapiens 

<220> 
15 <221> CDS 

<222> 129. .686 

<220> 

<221> sig_peptide 
<222> 129..185

<223> Von Heijne matrix

score 6.45239823575329 

seq SVFLLMVNGQVES/AQ 

25 <400> 121 

ettcgegaag gtgtcgctgc caagaaacgt gtcctgcgcg ctacgccgtc tgtttctagg 60 
gcaacgccgg cgtctcttag caaccgcgcg eggectaggt gggtcccccc ggcaccccca 120 
gacctgcc atg gcg ace gcg agt cct age gtc ttt eta etc atg gtc aac 170 
Met Ala Thr Ala Ser Pro Ser Val Phe Leu Leu Met Val Asn 

30 -15 -10 

ggg cag gtg gag age gee cag ttt cca gag tat gat gac etc tac tgc 218 

Gly Gin Val Glu Ser Ala Gin Phe Pro Glu Tyr Asp Asp Leu Tyr Cys 

-5 1 5 10

aag tac tgc ttt gtg tac ggc cag gac tgg gee ccc aca gcg ggt ctg 266 

35 Lys Tyr Cys Phe Val Tyr Gly Gin Asp Trp Ala Pro Thr Ala Gly Leu 
15 20 25 

gag gag ggg ate tea cag ate aca tec aag age caa gat gtg egg caa 314 
Glu Glu Gly lie Ser Gin lie Thr Ser Lys Ser Gin Asp Val Arg Gin 
30 35 40 

40 gca ctg gtg tgg aac ttc ccc att gat gtc acc ttt aaa age acc aac 362 
Ala Leu Val Trp Asn Phe Pro lie Asp Val Thr Phe Lys Ser Thr Asn 

45 50 55 

ccc tac ggc tgg cca cag ate gtg etc age gtg tat gga cca gat gtg 410 
Pro Tyr Gly Trp Pro Gin lie Val Leu Ser Val Tyr Gly Pro Asp Val 

60 65 70 75

ttc ggg aac gat gtg gtt cga ggc tat ggg gee gtg cac gtg ccc ttc 458 

Phe Gly Asn Asp Val Val Arg Gly Tyr Gly Ala Val His Val Pro Phe 

80 85 90 

tea cct ggc egg cac aaa agg acc ate ccc atg ttt gtc cca gaa tct 506 

50 Ser Pro Gly Arg His Lys Arg Thr lie Pro Met Phe Val Pro Glu Ser 
95 100 105 

acg tct aaa ctg cag aag ttt aca age tgg ttc atg ggg egg egg ccc 554 

Thr Ser Lys Leu Gin Lys Phe Thr Ser Trp Phe Met Gly Arg Arg Pro 
110 115 120 

55 gag tac aca gac ccc aag gtg gtg get cag ggt gaa ggc egg gaa get 602 

Glu Tyr Thr Asp Pro Lys Val Val Ala Gin Gly Glu Gly Arg Glu Ala 

125 130 135 

ate aca get ccc egg aaa get gtc ttc tct gtc cat ggc etc acc tea 650 

lie Thr Ala Pro Arg Lys Ala Val Phe Ser Val His Gly Leu Thr Ser 

60 140 145 150 155 

ccc agg gca ctg gee ttg gtc cac ate aag ggg acc tgaagcttcc 696 
Pro Arg Ala Leu Ala Leu Val His lie Lys Gly Thr 
160 165 
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ctgaagcctc tagcctgtgg tgtgcacgta caagcctcag gccccatttg tccagcctgt 756 

cagcagctgg gaaatactaa gtcaccctct tctggttatg tttaattttc caatttttct 816 

caacattact gaaatgtcta aatgtggaaa agttgacatc attttacagt gaacaccaca 876 

tacccaccac ctagatttta ccattaccaa tttcctgttc cgtacttgta tattcacata 936 

5 tatccaacta ttcatccctg cttcaatcca tcctattttt attgcatttc aaaataaact 996 

gtgaaatcag gaaaaaaaaa aaaaaaaa 1024 

<210> 122 

<211> 760 

10 <212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
15 <222> 165. .614 

<220> 

<221> sig_peptide 
<222> 165 . .305 
20 <223> Von Heijne matrix 

score 5.10820788278539 

seq ALGLALCSTKALS/VG 

<400> 122 

25 aatttccgat gccaggcacc ctcaaggcac agaggctggg gctcatgttg ggggcacttg 60 
gcctctccag gcctcgaagg cttctcctgg gctgatgcga gctggggaac gggagggacg 120 
gacgtgggag cgagaacgtc acactggagg cagctggtgg cacg atg ggg gac aga 176 

Met Gly Asp Arg 
-45 

30 gtg aaa ggt age aag tea aga gee ttc gtg tea cca tgg cca cac ace 224 
Val Lys Gly Ser Lys Ser Arg Ala Phe Val Ser Pro Trp Pro His Thr 

-40 -35 -30 

ccg atg gct tcc ggc ttg agg gac ccc tgg ctg cag ccc aca gcc ctg 272
Pro Met Ala Ser Gly Leu Arg Asp Pro Trp Leu Gln Pro Thr Ala Leu
-25 -20 -15

ggc ctt gca ctg tgc tct acg aag gcc ctg tcc gtg ggc tct gcc cct 320

Gly Leu Ala Leu Cys Ser Thr Lys Ala Leu Ser Val Gly Ser Ala Pro 

-10 -5 1 5 

ttg ccc ccg cga aat tec aac ace atg gcg gcg get gee ctg get gee 368 
40 Leu Pro Pro Arg Asn Ser Asn Thr Met Ala Ala Ala Ala Leu Ala Ala 

10 15 20 

ccc age ctg ggc ttc gat ggg gtg att ggg gtg etc gtg get gat acc 416 
Pro Ser Leu Gly Phe Asp Gly Val lie Gly Val Leu Val Ala Asp Thr 
25 30 35 

45 age etc acg gac atg cac gtg gtg gat gta gag ctg age gga ccc egg 464 
Ser Leu Thr Asp Met His Val Val Asp Val Glu Leu Ser Gly Pro Arg 

40 45 50 

ggc ccc acg ggc cga age ttt get gtg cac acc cgc aga gag aac cct 512 
Gly Pro Thr Gly Arg Ser Phe Ala Val His Thr Arg Arg Glu Asn Pro 
50 55 60 65 

gee gag cca ggc gcg gtc acc ggc tec gec acc gtc acg gee ttc tgg 560 
Ala Glu Pro Gly Ala Val Thr Gly Ser Ala Thr Val Thr Ala Phe Trp 
70 75 80 85 

egg age etc ctg gee tgc tgc cag etc ccc tec agg ccg ggg ate cat 608 
55 Arg Ser Leu Leu Ala Cys Cys Gin Leu Pro Ser Arg Pro Gly lie His 

90 95 100 

etc tgc tgagaagect cctccctccc gagacaagat catctgcctg gcctctcacc 664 
Leu Cys 

accaccatcc cacccctgcc ctgccccact tccccagggt ctcccttctg actcagtaaa 724 
60 gatcaccgct gcctccctca aaaaaaaaaa aaaaaa 760 






<210> 123 
<211> 594 
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<212> DNA 

<213> Homo sapiens 

<220> 
5 <221> CDS 

<222> 192 . .476 

<220> 

<221> sig_peptide 
10 <222> 192 . .326 

<223> Von Heijne matrix 

score 6.60884760057354 

seq FILLLLLSGPAEM/SA 

15 <400> 123 

acttttattg aaaaagacta cagcaaatca tactgaggtg aatgaagaca gtgaaatgaa 60 
ggagaaggca ggtcctcttt atgttttcgc agctggttca aggggtttgg ggttttctat 120 
ctaggttaaa gattgcgtaa tacacagctg gagccataga cattaatgca tgtttatcac 180 
acgcaacaac g atg ctg cat cat gtg att aca get ggg cct gtg ctg ctt 230 
20 Met Leu His His Val lie Thr Ala Gly Pro Val Leu Leu 

-45 -40 -35 

eta cac etc cct cgc cct gac act tec ace agg ttg etc etc ace tec 278 
Leu His Leu Pro Arg Pro Asp Thr Ser Thr Arg Leu Leu Leu Thr Ser 
-30 -25 -20 

25 gtc tct get ttt ate etc tta ctg etc ctt tea gga cca gca gaa atg 326 
Val Ser Ala Phe lie Leu Leu Leu Leu Leu Ser Gly Pro Ala Glu Met 

-15 -10 -5 

tea get tec cag gaa tec ttc cct gga tct ctg cag caa gaa ata get 374 
Ser Ala Ser Gin Glu Ser Phe Pro Gly Ser Leu Gin Gin Glu lie Ala 
30 1 5 10 15 

tct ctg ate act gta gca ctt ggt tct tta ata tct tta tct tgc tct 422 
Ser Leu lie Thr Val Ala Leu Gly Ser Leu lie Ser Leu Ser Cys Ser 

20 25 30 

ace ttg tta tat ttt tct tgt gaa ctt aaa att ccc tgt gag gac gta 470 
35 Thr Leu Leu Tyr Phe Ser Cys Glu Leu Lys lie Pro Cys Glu Asp Val 
35 40 45 

aac ctt tgaaggtatg tctcatatct ctgaacctct ttaaaatgee tagcatccct 526 
Asn Leu 
50 

40 gtgtgggtgc caattgettg tgtattgaat taaattgtga ttgttaactt gaaaaaaaaa 586 
aaaaaaaa 594 

<210> 124 

<211> 559 

45 <212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
<222> 16..297

<220>

<221> sig_peptide
<222> 16..93
<223> Von Heijne matrix

score 6.65836819891491 

seq FCGSACLLAVIRA/FF 

<400> 124 

ttacacaggg gataa atg gca gca atc gag att gaa gtc aag cct aac cag 51
Met Ala Ala Ile Glu Ile Glu Val Lys Pro Asn Gln
-25 -20 -15

ggc ttt tgc ggg agc gca tgc ctt ttg gct gta att cgt gca ttt ttt 99
Gly Phe Cys Gly Ser Ala Cys Leu Leu Ala Val Ile Arg Ala Phe Phe
-10 -5 1

ttt aag aaa aac gcc tgc ctt ctg cgt gag att ctc cag agc aaa ctg 147
Phe Lys Lys Asn Ala Cys Leu Leu Arg Glu Ile Leu Gln Ser Lys Leu
5 10 15

ggc ggc atg ggc cct gtg gtc ttt tcg tac aga ggg ctt cct ctt tgg 195
Gly Gly Met Gly Pro Val Val Phe Ser Tyr Arg Gly Leu Pro Leu Trp
20 25 30

ctc ttt gcc tgg ttg ttt cca aga tgt act gtg cct ctt act ttc ggt 243
Leu Phe Ala Trp Leu Phe Pro Arg Cys Thr Val Pro Leu Thr Phe Gly
35 40 45 50

ttt gaa aac atg agg ggg ttg ggc gtg gta gct tac gcc tgt aat ccc 291
Phe Glu Asn Met Arg Gly Leu Gly Val Val Ala Tyr Ala Cys Asn Pro
55 60 65

agc act tagggaggcc gaggcgggag gatggcttga ggtccgtagt tgagaccagc 347
Ser Thr

ctggccaaca tggtgaagcc tggtctctac aaaaaaataa taacaaaaat tagccgggtg 407
tggtggctcg tgcctgtggt cccagctgct ccggtggctg aggcgggagg atctcttgag 467
cttaggcttt tgagctatca tggcgccagt gcactccagc gtgggcaaca gagcgagacc 527
ctgtctctca aaaacaaaaa aaaaaaaaaa aa 559



<210> 125
<211> 744
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 216..635

<220>
<221> sig_peptide
<222> 216..335
<223> Von Heijne matrix
score 4.38054120608596
seq ITLVSAAPGKVIC/EM

<400> 125

gcttcgtcac aagggtgcga tgaaagtcag tgagcaaatc gcggaccacc ggggctgcca 60
gctcgcctga ctcccggcct cttgcgctcc taggggcgga gaagggtgcg ggctcttcgc 120
cctttgtgtc ctccttcttt cactaacttc tggactttcc agctcttccg aagttcgttc 180
ttgcgcaaag cccaaaggct ggaaaaccgt ccacg atg acc agc atg act cag 233
Met Thr Ser Met Thr Gln
-40 -35

tct ctg cgg gag gtg ata aag gcc atg acc aag gct cgc aat ttt gag 281
Ser Leu Arg Glu Val Ile Lys Ala Met Thr Lys Ala Arg Asn Phe Glu
-30 -25 -20

aga gtt ttg gga aag att act ctt gtc tct gct gct cct ggg aaa gtg 329
Arg Val Leu Gly Lys Ile Thr Leu Val Ser Ala Ala Pro Gly Lys Val
-15 -10 -5

att tgt gaa atg aaa gta gaa gaa gag cat acc aat gca ata ggc act 377
Ile Cys Glu Met Lys Val Glu Glu Glu His Thr Asn Ala Ile Gly Thr
1 5 10

ctc cac ggc ggt ttg aca gcc acg tta gta gat aac ata tca aca atg 425
Leu His Gly Gly Leu Thr Ala Thr Leu Val Asp Asn Ile Ser Thr Met
15 20 25 30

gct ctg cta tgc acg gaa agg gga gca ccc gga gtc agt gtc gat atg 473
Ala Leu Leu Cys Thr Glu Arg Gly Ala Pro Gly Val Ser Val Asp Met
35 40 45

aac ata acg tac atg tca cct gca aaa tta gga gaa gat ata gtg att 521
Asn Ile Thr Tyr Met Ser Pro Ala Lys Leu Gly Glu Asp Ile Val Ile
50 55 60

aca gca cat gtt ctg aag caa gga aaa aca ctt gca ttt acc tct gtg 569
Thr Ala His Val Leu Lys Gln Gly Lys Thr Leu Ala Phe Thr Ser Val
65 70 75

gat ctg acc aac aag gcc aca gga aaa tta ata gca caa gga aga cac 617
Asp Leu Thr Asn Lys Ala Thr Gly Lys Leu Ile Ala Gln Gly Arg His
80 85 90

aca aaa cac ctg gga aac tgagagaaca gcagaatgac ctaaagaaac 665
Thr Lys His Leu Gly Asn
95 100

ccaacaatga atatcaagta tagatttgac tcaaacaatt gtaatttttg aaataaacta 725
gcaaaaaaaa aaaaaaaaa 744



<210> 126
<211> 824
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 164..280

<220>
<221> sig_peptide
<222> 164..268
<223> Von Heijne matrix
score 5.73290676305402
seq TLPLCPVTSPVWG/WS

<400> 126

tgtgttcaat cgtgtgaatg gccggcgggc cccctccacg tccccatcct tcgaggggac 60
ccaggagacc tacacagtgg cccacgagga gaatgtccgc tttgtgtccg aaggtagcga 120
gcggggccag agggtgcggc ataggctgct gggtcgcaaa acc atg gac ccg gga 175
Met Asp Pro Gly
-35

tgg ccc cac ttc aag ctg acc cac agc cgc tgc atg gct gtg ctt ttc 223
Trp Pro His Phe Lys Leu Thr His Ser Arg Cys Met Ala Val Leu Phe
-30 -25 -20

ctt ggc act ctg ccc ttg tgt cct gtg acc agc cct gtg tgg ggc tgg 271
Leu Gly Thr Leu Pro Leu Cys Pro Val Thr Ser Pro Val Trp Gly Trp
-15 -10 -5 1

agt cca ggg tgaccatcag gccctgggtg ggcgatgggg tgcctgggac 320
Ser Pro Gly

ctggctcagc ccgactgccc tcctcccaca gcctggcagc aggtgcaaca gcagctggat 380
ggtggcccag ccggtgaggg cgggccaagg cctgtgcagt acgtggagag gacccccaat 440
ccccggctgc agaactttgt gcccatttac ctagacgagt ggtgggcgca gcagttcctg 500
gcgagaatca ccagctgttc ctagtggctg ctgggagggg gcgctgctac acggccgacc 560
tgtcgccagg agagaagcat ggcgccctgc ccacccactg cgcctggctg ggtgccggcc 620
acacctgaag tgccagcatt tggacttttg cacctttttt tcccttggcc cggctgtccc 680
aaccaagctg ccatggccaa gggccgaacc cgtctgacct cagccctgct cactgtgccc 740
agggaccagc gaccagcccc tggggctggc agggaggagc tccaggctaa taaagtggag 800
aaactgtcaa aaaaaaaaaa aaaa 824



<210> 127
<211> 526
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 68..301

<220>
<221> sig_peptide
<222> 68..190
<223> Von Heijne matrix
score 4.68908216483476
seq AYLLYILLTGALQ/FG

<400> 127

acatccggtg tggtcgacgg gtcctccaag agtttggggc gcggaccgga gtaccttgcg 60
tgcagtt atg tcg gcg tcg gta gtg tct gtc att tcg cgg ttc tta gaa 109
Met Ser Ala Ser Val Val Ser Val Ile Ser Arg Phe Leu Glu
-40 -35 -30

gag tac ttg agc tcc act ccg cag cgt ctg aag ttg ctg gac gcg tac 157
Glu Tyr Leu Ser Ser Thr Pro Gln Arg Leu Lys Leu Leu Asp Ala Tyr
-25 -20 -15

ctg ctg tat ata ctg ctg acc ggg gcg ctg cag ttc ggt tac tgt ctc 205
Leu Leu Tyr Ile Leu Leu Thr Gly Ala Leu Gln Phe Gly Tyr Cys Leu
-10 -5 1 5

ctc gtg ggg acc ttc ccc ttc aac tct ttt ctc tcg ggc ttc atc tct 253
Leu Val Gly Thr Phe Pro Phe Asn Ser Phe Leu Ser Gly Phe Ile Ser
10 15 20

tgt gtg ggg agt ttc atc cta gcg ggt tca ctc ttt gaa ttt cct gga 301
Cys Val Gly Ser Phe Ile Leu Ala Gly Ser Leu Phe Glu Phe Pro Gly
25 30 35

taagagttct ggagatggca gcttattgga cacatggatt ttcttcagat ttgcacttac 361
tgctagctct gctttttatg caggagaaaa gcccagagtt cactgtgtgt cagaacaact 421
ttctaacaaa catttattaa tccagcctct gcctttcatt aaatgtaacc ttttgccttc 481
caaattaaag aactccatgc cactcctcaa aaaaaaaaaa aaaaa 526



<210> 128
<211> 618
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 179..427

<220>
<221> sig_peptide
<222> 179..298
<223> Von Heijne matrix
score 7.72883276007822
seq CLVVVTMATLSLA/RP

<400> 128

aagcgaagag atgggtctgc actttggagg agccggacac tgttgacttt cctgatgtga 60
aatctaccca ggaacaaaac accagtgact gcagcagcag cggcagcgcc tcggttcctg 120
agcccaccgc aggctgaagg cattgcgcgt agtccatgcc cgtagaggaa gtgtgcag 178
atg gga tta acg tcc aca tgg aga tat gga aga gga ccg ggg att ggt 226
Met Gly Leu Thr Ser Thr Trp Arg Tyr Gly Arg Gly Pro Gly Ile Gly
-40 -35 -30 -25

acc gta acc atg gtc agc tgg ggt cgt ttc atc tgc ctg gtc gtg gtc 274
Thr Val Thr Met Val Ser Trp Gly Arg Phe Ile Cys Leu Val Val Val
-20 -15 -10

acc atg gca acc ttg tcc ctg gcc cgg ccc tcc ttc agt tta gtt gag 322
Thr Met Ala Thr Leu Ser Leu Ala Arg Pro Ser Phe Ser Leu Val Glu
-5 1 5

gat acc aca tta gag cca gaa gat gcc atc tca tcc gga gat gat gag 370
Asp Thr Thr Leu Glu Pro Glu Asp Ala Ile Ser Ser Gly Asp Asp Glu
10 15 20

gat gac acc gat ggt gcg gaa gat ttt gtc agt gag aac agt aac aac 418
Asp Asp Thr Asp Gly Ala Glu Asp Phe Val Ser Glu Asn Ser Asn Asn
25 30 35 40

aag agt aag taactgcccg gctccgatgg tccccgagag aggagcatgg 467
Lys Ser Lys

agggaagttc tgcctgtcac ctgtcttctt gtcgactctt ctgcgccatg ctgtgtcccg 527
cggcccttgc ctttccccgc tgtgtctact ttcctgactt tcaaacctga gaataaacca 587
gtgttgctgc acataaaaaa aaaaaaaaaa a 618

<210> 129
<211> 776
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 22..297

<220>
<221> sig_peptide
<222> 22..66
<223> Von Heijne matrix
score 4.68058603039206
seq VLAGSLLGPTSRS/AA

<400> 129

actgcgggac ccactgcgga t atg gct gtc ttg gct gga tcc ctg ttg ggc 51
Met Ala Val Leu Ala Gly Ser Leu Leu Gly
-15 -10

ccc acg agt agg tcg gca gcg ttg ctg ggt ggc agg tgg ctc cag ccc 99
Pro Thr Ser Arg Ser Ala Ala Leu Leu Gly Gly Arg Trp Leu Gln Pro
-5 1 5 10

cgg gcc tgg ctg ggg ttc cca gac gcc tgg ggc ctc ccc acc ccg cag 147
Arg Ala Trp Leu Gly Phe Pro Asp Ala Trp Gly Leu Pro Thr Pro Gln
15 20 25

cag gcc cgg ggc aag gct cgc ggg aat gag tat cag ccg agc aat atc 195
Gln Ala Arg Gly Lys Ala Arg Gly Asn Glu Tyr Gln Pro Ser Asn Ile
30 35 40

aaa cgc aag aac aag cac ggc tgg gtc cgg cgc ctg agc acg ccg gcc 243
Lys Arg Lys Asn Lys His Gly Trp Val Arg Arg Leu Ser Thr Pro Ala
45 50 55

ggc gtg cag gtc atc ctt cgc cga atg ctc aag ggc cgc aag tcg ctg 291
Gly Val Gln Val Ile Leu Arg Arg Met Leu Lys Gly Arg Lys Ser Leu
60 65 70 75

agc cat tgaggatcgc gacgcagtcg gcggggaccc tcatggaagc atcgccctcg 347
Ser His

cctcggacct tgcctggcgc tatttttgca gggagctggg gagcaggaac gcctcggacc 407
tgagtgctct ccatattgtg ggtttgaagt ctggatggga gccttgccaa gtcccttttt 467
aggcttttta attaggaagc atttcgaacc tgcgcaacag accaaagaac agtacaaaga 527
acatccgtgt acccagtacc ctgactaccg actacctaca acccgtccct gccccatcct 587
gagttctttt gaagctgatc tcaggcatcg gattatttct tctgtaaata tttcagaatg 647
tatctctcca agatgagagc tcattaaaag ataattacaa agcttatcac atccaaaaga 707
attatcaata attttgaaat attattaaac gtgtaataaa tgttcaaagt tcaaaaaaaa 767
aaaaaaaaa 776

<210> 130
<211> 998
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 9..845

<220>
<221> sig_peptide
<222> 9..134
<223> Von Heijne matrix
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score 6.13963522287438 
seq RSLALAAAPSSNG/SP 

<400> 130 

5 aacgaaag atg gcg gcg ccc gta agg egg acg ctg tta ggg gtg gcg ggg 50 
Met Ala Ala Pro Val Arg Arg Thr Leu Leu Gly Val Ala Gly 
-40 -35 -30 

ggt tgg egg egg ttc gag agg etc tgg gee ggc agt eta age tct cgc 98 
Gly Trp Arg Arg Phe Glu Arg Leu Trp Ala Gly Ser Leu Ser Ser Arg 
10 -25 -20 -15 

age ctg get ctt gca gee gca ccc tea age aac gga tec cca tgg cgc 146 
Ser Leu Ala Leu Ala Ala Ala Pro Ser Ser Asn Gly Ser Pro Trp Arg 

-10 -5 1 

ttg ttg ggc gcg ttg tgc ctg cag egg cca cct gta gtc tec aag ccg 194 
15 Leu Leu Gly Ala Leu Cys Leu Gin Arg Pro Pro Val Val Ser Lys Pro 
5 10 15 20 

ttg acc cca ttg cag gaa gag atg gcg tct eta ctg cag cag att gag 242 
Leu Thr Pro Leu Gin Glu Glu Met Ala Ser Leu Leu Gin Gin lie Glu 
25 30 35 

20 ata gag aga age ctg tat tea gac cac gag ctt cgt get ctg gat gaa 290 
lie Glu Arg Ser Leu Tyr Ser Asp His Glu Leu Arg Ala Leu Asp Glu 

40 45 50 

aac cag cga ctg gca aag aag aaa get gac ctt cat gat gaa gaa gat 338 
Asn Gin Arg Leu Ala Lys Lys Lys Ala Asp Leu His Asp Glu Glu Asp 
25 55 60 65 

gaa cag gat ata ttg ctg gcg caa gat ttg gaa gat atg tgg gag cag 386

Glu Gin Asp lie Leu Leu Ala Gin Asp Leu Glu Asp Met Trp Glu Gin 

70 75 80 

aaa ttt eta cag ttc aaa ctt gga get cgc ata aca gaa get gat gaa 434 
30 Lys Phe Leu Gin Phe Lys Leu Gly Ala Arg lie Thr Glu Ala Asp Glu 
85 90 95 100 

aag aat gac cga aca tec ctg aac agg aac eta gac agg aac ctt gtc 482 
Lys Asn Asp Arg Thr Ser Leu Asn Arg Asn Leu Asp Arg Asn Leu Val 
105 110 115 

35 ctg tta gtc aga gag aag ttt gga gac cag gat gtt tgg ata ctg ccc 530 
Leu Leu Val Arg Glu Lys Phe Gly Asp Gin Asp Val Trp lie Leu Pro 

120 125 130 

cag gca gag tgg cag cct ggg gag acc ctt cga gga aca get gaa cga 578 
Gin Ala Glu Trp Gin Pro Gly Glu Thr Leu Arg Gly Thr Ala Glu Arg 
40 135 140 145 

acc ctg gee aca etc tea gaa aac aac atg gaa gee aag ttc eta gga 626 
Thr Leu Ala Thr Leu Ser Glu Asn Asn Met Glu Ala Lys Phe Leu Gly 

150 155 160 

aat gca ccc tgt ggg cac tac aca ttc aag ttc ccc cag gca atg egg 674 
45 Asn Ala Pro Cys Gly His Tyr Thr Phe Lys Phe Pro Gin Ala Met Arg 
165 170 175 180 

aca gag agt aac etc gga gee aag gtg ttc ttc ttc aaa gca ctg eta 722 
Thr Glu Ser Asn Leu Gly Ala Lys Val Phe Phe Phe Lys Ala Leu Leu 
185 190 195 

50 tta act gga gac ttt tec cag get ggg aat aag ggc cat cat gtg tgg 770 
Leu Thr Gly Asp Phe Ser Gin Ala Gly Asn Lys Gly His His Val Trp 

200 205 210 

gtc att aag gat gag ctg ggt gac tat ttg aaa cca aaa tac ctg gee 818 
Val lie Lys Asp Glu Leu Gly Asp Tyr Leu Lys Pro Lys Tyr Leu Ala 
55 215 220 225 

caa gtt agg agg ttt gtt tea gac etc tgatgggccg agetgectgt 865 
Gin Val Arg Arg Phe Val Ser Asp Leu 

230 235 
ggacggtgct cagacaagtc tgggattaga gectcaagga cattgtgtga ttgcctcaca 925 
60 tttgcaggta atatcaagca gcaaactaaa ttctgagaaa taaacgagtc tattaccaaa 985 
aaaaaaaaaa aaa 998 

<210> 131




<211> 779 
<212> DNA 
<213> Homo sapiens 

<220>
<221> CDS
<222> 27..578

<220>
<221> sig_peptide
<222> 27..119
<223> Von Heijne matrix
      score 4.50637135496675
      seq TALMVGAASLLEG/RP

<400> 131 

atctttctgg actggccctg cagagg atg gca tgc acc act act gcc ccc gcc 53 

Met Ala Cys Thr Thr Thr Ala Pro Ala 
-30 -25 
20 cag gaa cac atg ctt etc acc cct etc act get ctg atg gtg ggg get 101 
Gin Glu His Met Leu Leu Thr Pro Leu Thr Ala Leu Met Val Gly Ala 

-20 -15 -10 

gct tct ctg ctt gag ggc cgg cca cag atc tca gct cca tac tcc cga 149
Ala Ser Leu Leu Glu Gly Arg Pro Gln Ile Ser Ala Pro Tyr Ser Arg
-5 1 5 10

get gca tgt tgc age cct ggg gca ctg gga tgt cct gca get egg gtt 197 
Ala Ala Cys Cys Ser Pro Gly Ala Leu Gly Cys Pro Ala Ala Arg Val 

15 20 25 

ggg att ctg gat ctg atg tat tec tgg gtt gcc agg aaa gtg etc agg 245 
30 Gly lie Leu Asp Leu Met Tyr Ser Trp Val Ala Arg Lys Val Leu Arg 
30 35 40 

tgc age aat act ggg ctg cag ggg ctg cac tgt gca cca get tat gca 293 
Cys Ser Asn Thr Gly Leu Gin Gly Leu His Cys Ala Pro Ala Tyr Ala 
45 50 55 

35 gca cag ctt ggt atg gac cct ggg agg ggc caa cga gca gga ggg cct 341 
Ala Gin Leu Gly Met Asp Pro Gly Arg Gly Gin Arg Ala Gly Gly Pro 

60 65 70 

gta gag cag aca tac ttc agt ccc atg ggg aag ctg ccc act ctt teg 389 
Val Glu Gin Thr Tyr Phe Ser Pro Met Gly Lys Leu Pro Thr Leu Ser 
40 75 80 85 90 

tgg ctg gaa ggc tgt aca gca gtc atg acg ctg gca tct get tgg ctt 437 
Trp Leu Glu Gly Cys Thr Ala Val Met Thr Leu Ala Ser Ala Trp Leu 

95 100 105 

ctg ggg age cct egg gaa act tac aat cat gag aag gtg aag gag aag 485 
45 Leu Gly Ser Pro Arg Glu Thr Tyr Asn His Glu Lys Val Lys Glu Lys 
110 115 120 

cag tgt cca ttc tec agt atg gtt ttg ggg gag tat ggc ttc eta cct 533 
Gin Cys Pro Phe Ser Ser Met Val Leu Gly Glu Tyr Gly Phe Leu Pro 
125 130 135 

50 act gtg gac cac ctg tea act ctg ggc tgt aac atg aga gaa ttg 578 
Thr Val Asp His Leu Ser Thr Leu Gly Cys Asn Met Arg Glu Leu 

140 145 150 

tgaacttctg tcttgtttga gccatggttt cattctcttt ttcagccatg tagcctgtgc 638 
tgtaactcag taccacatta gcaactagtg aaagtcaatg tgggtaaatt tgtcattctt 698 
55 caggttagaa catttcttcc ttttattctt gtgtttttgg ctaaataaac tgggaaatta 758 
tagtaaaaaa aaaaaaaaaa a 779 
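
The residue numbering printed under each translation follows from the `<222>` ranges of the CDS and sig_peptide features. As a rough sketch only (the helper name `feature_summary` is hypothetical and not part of the listing), the arithmetic for SEQ ID NO: 131 can be checked as below, assuming the ranges are 1-based and inclusive, that the CDS range excludes the stop codon, and that the signal peptide begins at the first codon of the CDS:

```python
def feature_summary(cds, sig_peptide):
    """Derive residue counts from <222> ranges (1-based, inclusive).

    Assumptions (not stated in the listing itself): the CDS range excludes
    the stop codon and the signal peptide starts at the first CDS codon.
    """
    cds_start, cds_end = cds
    sig_start, sig_end = sig_peptide
    cds_len = cds_end - cds_start + 1
    sig_len = sig_end - sig_start + 1
    if cds_len % 3 != 0 or sig_len % 3 != 0:
        raise ValueError("ranges must cover whole codons")
    if sig_start != cds_start or sig_end > cds_end:
        raise ValueError("signal peptide must sit at the start of the CDS")
    total = cds_len // 3
    signal = sig_len // 3
    return {
        "total_residues": total,
        "signal_residues": signal,
        "mature_residues": total - signal,
        # The initiator Met carries this (negative) number; the first
        # residue after the predicted cleavage site is numbered 1.
        "first_residue_number": -signal,
    }


# SEQ ID NO: 131: CDS 27..578, sig_peptide 27..119
summary_131 = feature_summary((27, 578), (27, 119))
print(summary_131)
```

Under these assumptions the initiator Met of SEQ ID NO: 131 is residue -31 and the mature polypeptide runs from residue 1 to residue 153, which agrees with the position labels printed under the translation.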

<210> 132 
<211> 1025 
<212> DNA

<213> Homo sapiens 



<220> 






<221> CDS
<222> 408..710

<220>
<221> sig_peptide
<222> 408..533
<223> Von Heijne matrix
      score 5.66440183652506
      seq QLCFHLSWLYSWA/SQ

<400> 132 

atggtttgtt gtgagttcca tgtcctcttg gatcagtcac tgtggccatg catgtttggc 60 
cacatgatta atccagtctg ggtcatgacc ttttcttcat ccaaaacaag gtgatgggaa 120 
gacaaaaaca atagctacta caaacaatag gagtttataa ttatgtgctg atgtattcga 180 
15 agatgtgttg acagtcgtga gtgtgtatcc taggaaaggc gagctggact ctgtctccat 240 
ggtggctctc accccaggga cctaggaaca gcctgtcacc acacaattac ttttataacc 300 
ctggagatga aaatctcctt gtcctcaaaa tacttccaga agaacaacca gatgggaagg 360 
accttggttg ggactctttc cagttcactt ggggcagagg gaattta atg gct cac 416
Met Ala His
-40

gta get gaa aag gat ggg eta gat tgg get tea ggc tgc ate cca gga 464 
Val Ala Glu Lys Asp Gly Leu Asp Trp Ala Ser Gly Cys lie Pro Gly 

-35 -30 -25 

ctc caa aca ggg atc tgt ctc ttt ggc tct cag ctc tgc ttt cat ttg 512
Leu Gln Thr Gly Ile Cys Leu Phe Gly Ser Gln Leu Cys Phe His Leu
-20 -15 -10

agt tgg ctt tat tct tgg gct tca cag tgt ggc ccc aca gca cca gtt 560
Ser Trp Leu Tyr Ser Trp Ala Ser Gln Cys Gly Pro Thr Ala Pro Val
-5 1 5

30 att gat aaa aag age tec cct ttg ctg aca gaa ctg ctg gat ttg gtt 608 
lie Asp Lys Lys Ser Ser Pro Leu Leu Thr Glu Leu Leu Asp Leu Val 
10 15 20 25 

etc att ggt cca gac gag gaa ggt ate cag cct caa gtc ate att gtg 656 
Leu lie Gly Pro Asp Glu Glu Gly lie Gin Pro Gin Val lie lie Val 

35 30 35 40 

gee agg aag atg gaa tac acc aaa tgg aca ggc ctg gca tgt ace cac 704 
Ala Arg Lys Met Glu Tyr Thr Lys Trp Thr Gly Leu Ala Cys Thr His 

45 50 55 

aga gac tgagagttgg tgctggtggt tgtggtggca gatgatatta cctgaagaag 760 

40 Arg Asp 

ggacgaatgg gtgctgggca ggacaaagca tcagctgtcc agttcaggcc tctcctcttt 820 
ccctggtgtc ttcattttcc tccgtctccc tgctgtccct taccctctgc ccaatctcat 880 
tactcctggt cttgggagtt gecttctgag gatactccac tgggggtacc tgagcctgga 940 
ttagagggca gggggaggat attgectage caaagtgggt gttcaataaa gaaccatttg 1000 
45 gagatggcaa aaaaaaaaaa aaaaa 1025 

<210> 133
<211> 607
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 247..501

<220>
<221> sig_peptide
<222> 247..306
<223> Von Heijne matrix
      score 6.43040298500966
      seq LLLVTLVASTVPG/NS

<400> 133






tgttacaaat attccctatg atctctcctt taaatattct tatcaggata ttggaaattc 60 
ttgattttca caactctgct tcagtggcat atgtttagct ttttgtcttc tgaattaatt 120 
gggcttctga tggtccctag aggtatcagc tactcagtca gaaaacatac atggggaaga 180 
aactgaagtt catgccacaa actgtagcag ctttggaaca gaagggacca gacaacctca 240 
aggaga atg ggc cca aat acc aaa aat tta ctc ttg gtg acc ctt gtt 288
Met Gly Pro Asn Thr Lys Asn Leu Leu Leu Val Thr Leu Val
-20 -15 -10

gct tct act gta cca ggc aac tct ctt ggg cag gat ttt act ttt gca 336
Ala Ser Thr Val Pro Gly Asn Ser Leu Gly Gln Asp Phe Thr Phe Ala
-5 1 5 10

cac tta gaa aga tcc tgc acc agg gaa aat cgg tct cct ggg gag gta 384
His Leu Glu Arg Ser Cys Thr Arg Glu Asn Arg Ser Pro Gly Glu Val
15 20 25

ttc cag caa cca tgc aag tct gga ggc ggg ggg gtt gga gaa cca aat 432 
15 Phe Gin Gin Pro Cys Lys Ser Gly Gly Gly Gly Val Gly Glu Pro Asn 
30 35 40 

gcc caa ggg cag cta ctt agc cag cac cca cta cct gcc ttc att aat 480
Ala Gln Gly Gln Leu Leu Ser Gln His Pro Leu Pro Ala Phe Ile Asn
45 50 55 

20 tgt tct cac ggg cag gec ttt tgaaccaccc tggtacagaa caccaaccct 531 
Cys Ser His Gly Gin Ala Phe 

60 65 
ggtgctttag gctgtctgtg ccatttctag gcaatgaacg agtagttact gtaccaaccc 591 
aaaaaaaaaa aaaaaa 607 

<210> 134
<211> 774
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 333..602

<220>
<221> sig_peptide
<222> 333..416
<223> Von Heijne matrix
      score 4.79986448293481
      seq VPALPLLSSLCMA/MV

<400> 134 

ctcttcagtc eggggcttgg ttgaaeggae tcaccaggaa acgtgacttt cgtgtccgac 60 
ctctgctgta tcaggattcg attcttggtg ttaaacaaga caaegctgaa ggctcggtgc 120 

45 agcagccctg caaaggtttt tccagcgctc ttgggaggtg ggctgtgccc tgcctggccc 180 
acctggccca cctggcccac cattacctga agggaagcat gaacagcett tgacgtggga 240 
gtggcgactg ctgagaggga actgtctgta cacaagcaat gtagecttat gggacctgag 300 
tggagcccca acccacgcag ggcgtgktct tc atg get ttt cct ggc caa tct 353 

Met Ala Phe Pro Gly Gin Ser 

50 -25 

gat acc aag atg cag tgg cca gaa gta cct gca ctt cca etc ctg tea 401 
Asp Thr Lys Met Gin Trp Pro Glu Val Pro Ala Leu Pro Leu Leu Ser 

-20 -15 -10 

agt ctc tgc atg gct atg gtg agg aag agc tct gca ctg ggc aag gaa 449
Ser Leu Cys Met Ala Met Val Arg Lys Ser Ser Ala Leu Gly Lys Glu
-5 1 5 10

gtt ggc cgt cga gtg aag gaa atg gtg atg ctg gtg gee cct ttc egg 497 
Val Gly Arg Arg Val Lys Glu Met Val Met Leu Val Ala Pro Phe Arg 
15 20 25 

60 cag tea agt tec eta tea agg aca ttc agt tct egg aaa gtg gtg aag 545 
Gin Ser Ser Ser Leu Ser Arg Thr Phe Ser Ser Arg Lys Val Val Lys 

30 35 40 

gca cat gct tcc ctg cat ggt gcc cgc ctc tct cca ctc tct aga aat 593
Ala His Ala Ser Leu His Gly Ala Arg Leu Ser Pro Leu Ser Arg Asn

45 50 55 

att aga ggc taggctgctg ctgtatgtca gggctagtcc ctcttctatg 642
Ile Arg Gly
60

aatccagaat aactctgaag aagccgagta acaggcatga agtgaagaga aatcgctgta 702 
acaggaagac agcaaagcag atgctaatga ccacactatt taacgaactg gaaccaacaa 762 
aaaaaaaaaa aa 774 

<210> 135
<211> 611
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 110..376

<220>
<221> sig_peptide
<222> 110..208
<223> Von Heijne matrix
      score 3.64796206065748
      seq LVPHSPLPGALSS/AP

<220>
<221> misc_feature
<222> 347
<223> n=a, g, c or t

<400> 135

tcttgtcaac actgcccact cagcgaggaa gcagccgcga cgcccacact tcctgttgga 60 
gcctgcgcag agccagaggc ctcagaagcc acaggaacat ggcctaggc atg gct cag 118
Met Ala Gln
cca gca gcc ccc tcc ctg acg cgg ccc ttc ctg gca gag gcc ccg aca 166
Pro Ala Ala Pro Ser Leu Thr Arg Pro Phe Leu Ala Glu Ala Pro Thr
-30 -25 -20 -15 

gca ctg gtc cca cac age ccc ctg cct ggg gee ctg tea age gee cct 214 
Ala Leu Val Pro His Ser Pro Leu Pro Gly Ala Leu Ser Ser Ala Pro 

40 -10 -5 1 

ggc ccg aag cag ccc ccg acg gca age aca ggc ccg gag ctg ctg ctg 262 
Gly Pro Lys Gin Pro Pro Thr Ala Ser Thr Gly Pro Glu Leu Leu Leu 

5 10 15 

ctg cct ctt tec tec ttc atg ccc tgc ggg gcg get gca cca gee agg 310 

45 Leu Pro Leu Ser Ser Phe Met Pro Cys Gly Ala Ala Ala Pro Ala Arg 
20 25 30 

gtg tea tea cag egg get act cct agg gat aag ccc ncc ggt ccc etc 358 
Val Ser Ser Gin Arg Ala Thr Pro Arg Asp Lys Pro Xaa Gly Pro Leu 
35 40 45 50 

50 ate cct ggc cag tgt ccc tgacccccat ctactccttc ctggggactt 406 
lie Pro Gly Gin Cys Pro 
55 

ctcagcgcca gcccattggc gcctgcgttg cccgcatcca ggccctgcgg caggccctgt 466 
gctagcgtgt tcgcaccagg aacgeaggtg ctgggctgtc ggggaggect caggccacct 526 
55 ccaggaacag aacacagttt taagtttgat tttttttatt teaaaatget ttgcaattaa 586 
atgaattact gttcaaaaaa aaaaa 611 

<210> 136
<211> 925
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 22..417

<220>
<221> sig_peptide
<222> 22..66
<223> Von Heijne matrix
      score 5.47092708754574
      seq RVLCAPAAGAVRA/LR



<400> 136 

agtcgaggag tcaaggcagc a atg aat cgt gtc ttg tgt gcc ccg gcg gcc 51 

Met Asn Arg Val Leu Cys Ala Pro Ala Ala 
-15 -10 

ggg gcc gtc cgg gcg ctg agg ctc ata ggc tgg gct tcc cga agc ctt 99
Gly Ala Val Arg Ala Leu Arg Leu Ile Gly Trp Ala Ser Arg Ser Leu
-5 1 5 10

cat ccg ttg ccc ggt tcc cgg gat cgg gcc cac cct gcc gcc gag gaa 147
His Pro Leu Pro Gly Ser Arg Asp Arg Ala His Pro Ala Ala Glu Glu
15 20 25

gag gac gac cct gac cgc ccc att gag ttt tec tec age aaa gcc aac 195 
Glu Asp Asp Pro Asp Arg Pro lie Glu Phe Ser Ser Ser Lys Ala Asn 

30 35 40 

cct cac cgc tgg teg gtg ggc cat ace atg gga aag gga cat cag egg 243 

25 Pro His Arg Trp Ser Val Gly His Thr Met Gly Lys Gly His Gin Arg 
45 50 55 

ccc tgg tgg aag gtg ctg ccc etc age tgc ttc etc gtg gcg ctg ate 291 
Pro Trp Trp Lys Val Leu Pro Leu Ser Cys Phe Leu Val Ala Leu lie 
60 65 70 75 

30 ate tgg tgc tac ctg agg gag gag age gag gcg gac cag tgg ttg aga 339 
lie Trp Cys Tyr Leu Arg Glu Glu Ser Glu Ala Asp Gin Trp Leu Arg 

80 85 90 

cag gtg tgg gga gag gtg cca gag ccc agt gat cgt tct gag gag cct 387 
Gin Val Trp Gly Glu Val Pro Glu Pro Ser Asp Arg Ser Glu Glu Pro 

35 95 100 105 

gag act cca get gcc tac aga gcg aga act tgacggggtg cccgctgggg 437 
Glu Thr Pro Ala Ala Tyr Arg Ala Arg Thr 

110 115 
ctggcaggaa gggagecgae agccgccctt eggatttgat gtcacgtttg cccgtgactg 497 

40 tcctggctat gcgtgcgtcc tcagcactga aggacttggc tggtggatgg ggcacttggc 557 
tatgetgatt cgcgtgaagg eggagcagaa tctcagcaga teggaaactg ctcctcgcct 617 
ggctcttgat gtccaaggat tccatcggca agacttctca gatccttggg gaaggtttca 677 
gttgcactgt atgctgttgg atttgecaag tctttgtata acataatcat gtttccaaag 737 
cacttctggt gacacttgtc atccagtgtt agtttgcagg taatttgett tctgagatag 797 

45 aatatctggc agaagtgtga aactgtattg catgetgegg cctgtgcaag gaacacttcc 857 
acatgtgagt tttacacaac aacaaatgaa aataaatttt aattttataa taaaaaaaaa 917 
aaaaaaaa 925

<210> 137
<211> 674
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 62..367

<220>
<221> sig_peptide
<222> 62..103
<223> Von Heijne matrix
      score 4.49063834776683
      seq FLTALLWRGRIPG/RQ

<400> 137

acgccacggc gtctgctggc ggccgcggag acgcagagtc ttgagcagcg cggcaggcac 60 
c atg ttc ctg act gcg ctc ctc tgg cgc ggc cgc att ccc ggc cgt cag 109
Met Phe Leu Thr Ala Leu Leu Trp Arg Gly Arg Ile Pro Gly Arg Gln

-10 -5 1 

tgg ate ggg aag cac egg egg ccg egg ttc gtg teg ttg cgc gee aag 157 
Trp lie Gly Lys His Arg Arg Pro Arg Phe Val Ser Leu Arg Ala Lys 
5 10 15 

10 cag aac atg ate cgc cgc ctg gag ate gat gcg gag aac cat tac tgg 205 
Gin Asn Met lie Arg Arg Leu Glu He Asp Ala Glu Asn His Tyr Trp 

20 25 30 

ctg age atg ccc tac atg ace egg gag cag gag cgc ggc cac gee gsg 253 
Leu Ser Met Pro Tyr Met Thr Arg Glu Gin Glu Arg Gly His Ala Xaa 
15 35 40 45 50 

dtg cgc agg agg gag gec ttc gag gee ata aag gcg gee gee act tec 301 
Xaa Arg Arg Arg Glu Ala Phe Glu Ala He Lys Ala Ala Ala Thr Ser 

55 60 65 

aag ttc ccc ccg cat aga ttc att gcg gac cag etc gac cat etc aat 349 
20 Lys Phe Pro Pro His Arg Phe He Ala Asp Gin Leu Asp His Leu Asn 
70 75 80 

gtc acc aag aaa tgg tec taatcctgag tagtcaccct tggattttat 397 
Val Thr Lys Lys Trp Ser 
85 

25 ggatcacgga gctgaccatc tttacctggt cctggaactg aaaaactgta gcttgtgtga 457 
aaatgagect ttggaccagt ctttattaaa acaaacaaac atgagtagtc tgeatatega 517 
atatctagag ctctaaaccc cccaatactt aaaagtctaa ttgctgtcct gtggtttcat 577 
tagtctgata ggaagatagg gatttcctca gtcacagatg atattttgaa ggaaagctgc 637 
aataaageca caatgattcg aaaaaaaaaa aaaaaaa 674 

<210> 138
<211> 1725
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 107..1618

<220>
<221> sig_peptide
<222> 107..178
<223> Von Heijne matrix
      score 6.19650168602189
      seq LGLYSLVLSGALA/YA

<400> 138 

agagctcagc cggtcgcacg gaeggacagt tggaagcegg accccagagc ctgaggtggg 60 
cagtgtgcca gggtcccttg cggcctcctc aagccctgtc caggct atg ggc atc 115
Met Gly Ile
aag aca gca ttg ccg gcg gct gag ctg ggc ctc tac tct ctg gtg ctg 163
Lys Thr Ala Leu Pro Ala Ala Glu Leu Gly Leu Tyr Ser Leu Val Leu
-20 -15 -10

agt ggg gcc ctg gcc tat gct ggc cgg ggc ctc ctt gag gct tca caa 211
Ser Gly Ala Leu Ala Tyr Ala Gly Arg Gly Leu Leu Glu Ala Ser Gln
-5 1 5 10

gat ggg gee cac agg aag gee ttc egg gag tct gtg cga cct ggc tgg 259 
Asp Gly Ala His Arg Lys Ala Phe Arg Glu Ser Val Arg Pro Gly Trp 
15 20 25 

60 gag tac att ggc egg aag atg gat gtg get gac ttc gag tgg gtg atg 307 
Glu Tyr He Gly Arg Lys Met Asp Val Ala Asp Phe Glu Trp Val Met 

30 35 40 

tgg ttc acc tcc ttt cgc aac gtc atc atc ttt gcc ctc tcc gga cat 355
Trp Phe Thr Ser Phe Arg Asn Val Ile Ile Phe Ala Leu Ser Gly His
45 50 55

gtg ctg ttt gct aaa ctc tgc acg atg gtt gcc cca aag ctc cgc tcc 403
Val Leu Phe Ala Lys Leu Cys Thr Met Val Ala Pro Lys Leu Arg Ser
60 65 70 75

tgg atg tat gct gtg tac ggg gcc ttg gct gtg atg ggc aca atg ggc 451
Trp Met Tyr Ala Val Tyr Gly Ala Leu Ala Val Met Gly Thr Met Gly
80 85 90

cct tgg tac ctg ctg ctg ctg ctt ggt cac tgt gtg ggc ctc tat gtg 499
Pro Trp Tyr Leu Leu Leu Leu Leu Gly His Cys Val Gly Leu Tyr Val
95 100 105

gcc tcg ctt ttg ggc cag ccc tgg ctc tgt ctt ggc ctt ggc ttg gcc 547
Ala Ser Leu Leu Gly Gln Pro Trp Leu Cys Leu Gly Leu Gly Leu Ala
110 115 120

agc ctg gcc tcc ttc aag atg gac ccc cta atc tct tgg cag agc ggg 595
Ser Leu Ala Ser Phe Lys Met Asp Pro Leu Ile Ser Trp Gln Ser Gly
125 130 135

ttt gta aca ggc act ttt gat ctt caa gag gtg ctg ttt cat ggg ggc 643
Phe Val Thr Gly Thr Phe Asp Leu Gln Glu Val Leu Phe His Gly Gly
140 145 150 155

agc agc ttc aca gtg ctg cgt tgc acc agc ttt gca ctg gag agc tgt 691
Ser Ser Phe Thr Val Leu Arg Cys Thr Ser Phe Ala Leu Glu Ser Cys
160 165 170

gcc cac cct gac cgc cac tac tcc tta gct gac ctg ctc aag tac agc 739
Ala His Pro Asp Arg His Tyr Ser Leu Ala Asp Leu Leu Lys Tyr Ser
175 180 185

ttc tac ctg ccc ttc ttc ttc ttc ggg ccc atc atg acc ttt gat cgc 787
Phe Tyr Leu Pro Phe Phe Phe Phe Gly Pro Ile Met Thr Phe Asp Arg
190 195 200

ttc cat gct cag gtg aac cag gtg gag cca gtg aga cgc gag ggt gag 835
Phe His Ala Gln Val Ser Gln Val Glu Pro Val Arg Arg Glu Gly Glu
205 210 215

ctg tgg cac atc cga gcc cag gca ggc cta agc gtg gtg gcc atc atg 883
Leu Trp His Ile Arg Ala Gln Ala Gly Leu Ser Val Val Ala Ile Met
220 225 230 235

gcc gtc gac atc ttc ttt cac ttc ttc tac atc ctc act atc ccc agc 931
Ala Val Asp Ile Phe Phe His Phe Phe Tyr Ile Leu Thr Ile Pro Ser
240 245 250

gac ctc aag ttc gcc aac cgc ctc cca gac att gcc ctc gct ggc cta 979
Asp Leu Lys Phe Ala Asn Arg Leu Pro Asp Ile Ala Leu Ala Gly Leu
255 260 265

gcc tat tca aac ctg gtg tat gac tgg gtg aag gcg gcc gtc ctc ttt 1027
Ala Tyr Ser Asn Leu Val Tyr Asp Trp Val Lys Ala Ala Val Leu Phe
270 275 280

ggt gtt gtc aac act gtg gca tgc ctc gac cac ctg gac cca ccc cag 1075
Gly Val Val Asn Thr Val Ala Cys Leu Asp His Leu Asp Pro Pro Gln
285 290 295

cct ccc aag tgc atc acc gca ctc tac gtc ttt gcg gaa acg cac ttt 1123
Pro Pro Lys Cys Ile Thr Ala Leu Tyr Val Phe Ala Glu Thr His Phe
300 305 310 315

gac cgt ggc atc aac gac tgg ctt tgc aaa tat gtg tat aac cac att 1171
Asp Arg Gly Ile Asn Asp Trp Leu Cys Lys Tyr Val Tyr Asn His Ile
320 325 330

ggt ggg gag cat tcc gct gtg atc cca gag ctg gca gcc aca gtg gcc 1219
Gly Gly Glu His Ser Ala Val Ile Pro Glu Leu Ala Ala Thr Val Ala
335 340 345

aca ttt gcc atc acc aca ctg tgg ctt ggg cct tgt gac att gtc tac 1267
Thr Phe Ala Ile Thr Thr Leu Trp Leu Gly Pro Cys Asp Ile Val Tyr
350 355 360

ctg tgg tca ttc ctt aac tgc ttt ggc ctc aac ttt gag ctc tgg atg 1315
Leu Trp Ser Phe Leu Asn Cys Phe Gly Leu Asn Phe Glu Leu Trp Met
365 370 375

caa aaa ctg gca gag tgg ggg ccc cta gca cga att gag gcc tct ctg 1363
Gln Lys Leu Ala Glu Trp Gly Pro Leu Ala Arg Ile Glu Ala Ser Leu
380 385 390 395

tea gtg cag atg tec cgt agg gtc egg gec ctg ttt gga gee atg aac 1411 
Ser Val Gin Met Ser Arg Arg Val Arg Ala Leu Phe Gly Ala Met Asn 
5 400 405 410 

ttc tgg gec ate ate atg tac aac ctt gtg age ctg aac age etc aaa 1459 
Phe Trp Ala lie lie Met Tyr Asn Leu Val Ser Leu Asn Ser Leu Lys 

415 420 425 

ttc aca gag ctg gtt gec egg cgc ctg eta etc aca ggg ttc ccc cag 1507 

10 Phe Thr Glu Leu Val Ala Arg Arg Leu Leu Leu Thr Gly Phe Pro Gin 
430 435 440 

acc acg ctg tec ate ctg ttt gtc acc tac tgt ggc gtc cag ctg gta 1555 
Thr Thr Leu Ser lie Leu Phe Val Thr Tyr Cys Gly Val Gin Leu Val 
445 450 455 

15 aag gag cgt gag cga acc ttg gca ctg gag gag gag cag aag cag gac 1603 
Lys Glu Arg Glu Arg Thr Leu Ala Leu Glu Glu Glu Gin Lys Gin Asp 
460 465 470 475 

aaa gag aag ccg gag taggagggag egggtagagg gatgggctct gctcagctat 1658 
Lys Glu Lys Pro Glu 

20 480 

tettgggeca gatggggect gaccgataga ataaaagact tttctacaac aaaaaaaaaa 1718 
aaaaaaa 1725 

<210> 139
<211> 1474
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 16..471

<220>
<221> sig_peptide
<222> 16..93
<223> Von Heijne matrix
      score 5.809301698725
      seq FCVCVIAIGVVQA/LI

<400> 139

tacacgtttt cgtta atg gtg acc ttc cct gat gtg cct ctg ggc ate ttc 51 
Met Val Thr Phe Pro Asp Val Pro Leu Gly lie Phe 

-25 -20 -15 

ttg ttc tgt gtg tgt gtg ate gec ate ggg gtc gtg cag gca ctg att 99 

45 Leu Phe Cys Val Cys Val lie Ala lie Gly Val Val Gin Ala Leu He 

-10 -5 1 

gta ggg tac gca ttc cac ttc ccg cac ctg ctg age ccg cag ate cag 147 

Val Gly Tyr Ala Phe His Phe Pro His Leu Leu Ser Pro Gin He Gin 

5 10 15 

50 cgc tct gec cac agg get ctg tac cga cga cac gtc ctg ggc ate gtc 195 

Arg Ser Ala His Arg Ala Leu Tyr Arg Arg His Val Leu Gly He Val 

20 25 30 

etc caa ggc ccg gec ctg tgc ttt gca gcg gec ate ttc tct etc ttc 243 

Leu Gin Gly Pro Ala Leu Cys Phe Ala Ala Ala He Phe Ser Leu Phe 

55 35 40 45 50 

ttt gtc ccc ttg tct tac ctg ctg atg gtg act gtc ate etc etc ccc 291 

Phe Val Pro Leu Ser Tyr Leu Leu Met Val Thr Val He Leu Leu Pro 

55 60 65 

tat gtc age aag gtc acc ggc tgg tgc aga gac agg etc ctg ggc cac 339 

60 Tyr Val Ser Lys Val Thr Gly Trp Cys Arg Asp Arg Leu Leu Gly His 

70 75 80 

^99 9 a 9 ccc tc 9 9 ct cac cca 9^9 9 aa 9 tc ttc tc 9 tfct 9 ac ctc cac 387 
Arg Glu Pro Ser Ala His Pro Val Glu Val Phe Ser Phe Asp Leu His
85 90 95

gag cca ctc agc aag gag cgc gtg gaa gcc ttc agc gac gga gtc tac 435
Glu Pro Leu Ser Lys Glu Arg Val Glu Ala Phe Ser Asp Gly Val Tyr
100 105 110

gcc atc gtg gcc acg ctt ctc atc ctg gac atc tgg tgaggacccc 481
Ala Ile Val Ala Thr Leu Leu Ile Leu Asp Ile Trp
115 120 125

gcgtcacctg ccccagctat caggtggcca atgtgtcttg agtccctggc gtctcatcct 541
ggaaacccca gaaaggcaca ggggtcttgg ctccaccctc ctctggatgc ctagagtttt 601
gtgtgaggtc agggcagccc ccacttcagg gaggacaacc ttcccggcgg cccctccctt 661
cccagcggcc cctcccttcc cagaggctcc caccccaagc acagccgagg atggggtgcc 721
agggtgaggt cagcaccagc agccaactgc tctcctcact cctctcagag gggctcagca 781
gccatgggta tccccctgcc ccaggcctca cccctgcccc aacaccagcc cctcctagtc 841
cctagtccct cccattccct ccggctccct cccagtgccc cccatcgctt cgcagcccct 901
tctgctccct ttggctggct gttgcttcct tccagcgtct gctcctccgc ggcctcatct 961
gcctcttcgt ctgttagagc gcgcgtctcg tctcagtcgt cacgtttttg gtttttgtgg 1021
ggtttttttt tttttttttt tttgagacag tcctgctgtg tcgcccaggc tggagtatag 1081
tggctcaagc tcagctcact gcaacctccg cctcccaggt tcaagcaatt ctcctgcctc 1141
agcctcccaa gtagttggga ttacaagcac ccaccaccat gcccagctaa ctttttgcat 1201
ttttaataga gatgaggttt caccaagttg gccaggctgg tcttgaactc ctgacctcag 1261
gtgatctgcc cacctcggcc tcccaaagtg ctgggattac aggtgtaagc caccgtgccc 1321
ggccatcgta atgtttgaat ttgctttttt acatcttcca tccttttgga gtgtcttgtt 1381
ccctcgtcat agttcagcac tgtgaccacc ttggggttag acactatggt tttatatcct 1441
gtacttgata ttctccaaaa aaaaaaaaaa aaa 1474

<210> 140
<211> 653
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 222..374



<220>
<221> sig_peptide
<222> 222..299
<223> Von Heijne matrix
      score 4.28353322771141
      seq ILFKFSLCPYAAA/LS



<400> 140
taataatgtt gttaaattat tgccttctca tctgcgtgtc tcttatgttc tgcttaaaga 60
gattgtcagt ttgttcaagc tctttttagt tgttgctcct ccagtgccta gctttgagct 120
ttgtacacgg tagttattga gttgagtaac atagtttgtt ctgagtcatt tgttccacat 180
gcttgaagac ttggcttaac ctagtagata ataggaaaga a atg gaa atg ctc ttt 236
Met Glu Met Leu Phe
-25

gat gaa aga gcc cct ctc tta ttc atc ctt ttt aaa ttt tct ttg tgc 284
Asp Glu Arg Ala Pro Leu Leu Phe Ile Leu Phe Lys Phe Ser Leu Cys
-20 -15 -10

cca tat gca gca gct ctc agc aaa cct ata ttt ggc agt gtg gcc tgt 332
Pro Tyr Ala Ala Ala Leu Ser Lys Pro Ile Phe Gly Ser Val Ala Cys
-5 1 5 10

atg act aaa gaa atc ctg gcc agg cac ggt ggc tca cgc ctg 374
Met Thr Lys Glu Ile Leu Ala Arg His Gly Gly Ser Arg Leu
15 20 25
taatcccagc actttgggag gccgaggcgg gtggattacg aggtcaggag attgagacca 434
tcctggctaa catggcgaaa ccccatctct acgaaaaata caaaaaaaaa aattagccgg 494
gcatcatggc gggcgcctgt agtcttagct actcaggagg ctgaggcagg agaatggcgt 554
gaacccggga ggcggagctt gcagtgagcc gagattgcgc cactgcactc cagcctgggg 614
caacagagca agactccgtc tcaaaaaaaa aaaaaaaaa 653
<210> 141
<211> 1490
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 59..274

<220>
<221> sig_peptide
<222> 59..127
<223> Von Heijne matrix
      score 7.37647149292058
      seq LGLCSLLVGEAEA/PS

<400> 141

agacagaggc agggcttgcg acggaagtgg cctctctgct tctgcagggc tggggaag 58 

atg ctg cgt cca gcg tta ccg tgg ctg tac ctt ggc ctc tgc agc ctc 106
Met Leu Arg Pro Ala Leu Pro Trp Leu Tyr Leu Gly Leu Cys Ser Leu
-20 -15 -10

ctg gtg ggg gag gca gag gcc ccg agc ccc gtg gat ccg ctg gag cgg 154
Leu Val Gly Glu Ala Glu Ala Pro Ser Pro Val Asp Pro Leu Glu Arg
-5 1 5

25 age egg ccg tac gcg gtg ctg cga ggg cag aac ctg gtg ttg atg gga 202 
Ser Arg Pro Tyr Ala Val Leu Arg Gly Gin Asn Leu Val Leu Met Gly 
10 15 20 25 

ace att ttc age ate ctg ctg gtg act gtc ate ctt atg gca ttt tgt 250 
Thr lie Phe Ser lie Leu Leu Val Thr Val lie Leu Met Ala Phe Cys 

30 30 35 40 

gtc tac aag ccc att egg cgt egg tgacagccag acaagttctt caatgagtat 304 
Val Tyr Lys Pro lie Arg Arg Arg 
45 

ttgggaatag gataagttgt gttgeacaca ggccagtgga gaagttggaa ccaaaacttt 364 

35 cctacttgga aatgaccttt ggtctggaca gttggtaaat gctaaatgaa ttagaagaaa 424 

acatgtacta gacattattt tttcctaaca ctgtagcgca aataattggc ccctgagtcc 484 

gcttctcagt gtttctgact gtacttgtta aaagtaagac ctgaaagctc caaaggtcag 544 

tgtaaagatg gagtgttcat gagaaagaaa acatggtaac cttgtgagtg cctgtaagaa 604 

ccacactgta aagaactcat cattaatget tgaaaatgtt attaagaagg agacttacca 664 

40 tgcagacatt ccctatttaa gaaccatttg gttacagtgg gttaagaatc acagattttt 724 

ttttttaatc tcacctgagt tagectagaa tgcgctggtt gcaaagtggt gtcagctgtg 784 

gggatcttgg gccctcgttc ctcacctgca tcctgccctg cactcaggtg ctccccctga 844 

agtcagggtc acatcaggta gacctgttac tatatgeace tttggcctgg aatgctctga 904 

agttggactg gaaatgttac taggttggcc tgttacaaaa aggaccccat ectgettaaa 964 

45 cacattgatc tcccttgccc tgcatttgag tctttctagc ccacggtctg aaacttgagg 1024 

cagctttcca gatttggaat gtaaaaggct cagtgggcac tctgttcatc cctgggtggg 1084 

gagggeccag ccaacagaag tgcatgtcca ctgtgcgggc cagtgtgtgt ttacacaaat 1144 

ttcatctcag ctttgaaaat getgetatta gtttccactg ttggtgaact ggattttttc 1204 

ctcctattga aatgatactt tcatacttat aaagctgtcg tcaatattta tttcaaggtg 1264 

50 ctagatttaa ttttgttatt aaattgaaat gcttatcttg tgttcaagca cagcactgat 1324 

tttaacaacc tgcatttaat gtgaagtaac cgaagtagga tactgtaact gtgtaaggat 1384 

tttgtttgta atcttgtaac attgaaccat tgaaatgttc agttctttgc ttttgagcaa 1444 

aacgtcaatt aaaactaaag taaaatctta aaaaaaaaaa aaaaaa 1490 
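
The `seq` line of each sig_peptide feature can be cross-checked against the codons around the predicted cleavage site. The sketch below is illustrative only: `translate` is a hypothetical helper, the standard genetic code is assumed, and the fifteen codons are those ending the signal peptide of SEQ ID NO: 141 with scan-damaged codons read as `ctc`, `agc` and `gcc` (an assumption based on the printed translation).

```python
from itertools import product

# Standard genetic code in the conventional t, c, a, g ordering.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate whitespace-separated codons into one-letter residues.

    Only a, c, g and t are handled; ambiguity codes such as n or k
    raise KeyError rather than being guessed.
    """
    seq = "".join(nucleotides.lower().split())
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Last 13 codons of the signal peptide plus the first 2 mature codons.
region = "ctt ggc ctc tgc agc ctc ctg gtg ggg gag gca gag gcc ccg agc"
peptide = translate(region)
signal_tail, mature_head = peptide[:13], peptide[13:]
print(f"{signal_tail}/{mature_head}")
```

The slash marks the predicted cleavage position, so the printed string can be compared directly with the `seq` entry under `<223>` for the same sequence.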

<210> 142
<211> 661
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 158..442

<220>
<221> sig_peptide
<222> 158..301
<223> Von Heijne matrix
      score 7.53908709538105
      seq FVILLLFIFTVVS/LV

<400> 142

aaaaacagac gataccatcg cttcagcagc atcctctcag acaagagcca ctatttctga 60 
10 ttcagatcac ctgtcatcga agtttaaaga aggggaaaca ggagacagaa atacactgaa 120 
ccaaaaagat tcaaaagagc aagtggaatc tctaaga atg get tec age cac tgg 175 

Met Ala Ser Ser His Trp 
-45 

aat gaa acc act ace tct gtt tat cag tac ctt ggt ttt caa gtt caa 223 
15 Asn Glu Thr Thr Thr Ser Val Tyr Gin Tyr Leu Gly Phe Gin Val Gin 
-40 -35 -30 

aaa att tac cct ttc cat gac aac tgg aac act gee tgc ttt gtc ate 271 
Lys lie Tyr Pro Phe His Asp Asn Trp Asn Thr Ala Cys Phe Val lie 
-25 -20 -15 

ctg ctt tta ttt ata ttt aca gtg gta tct tta gtg gtg ctg gct ttc 319
Leu Leu Leu Phe Ile Phe Thr Val Val Ser Leu Val Val Leu Ala Phe
-10 -5 1 5

ctt tat gaa gtg ctt gac tgc tgc tgc tgt gta aaa aac aaa acc gtg 367 
Leu Tyr Glu Val Leu Asp Cys Cys Cys Cys Val Lys Asn Lys Thr Val 

25 10 15 20 

aaa gac ttg aaa agt gaa ccc aac cct ctt aga agt atg atg gac aac 415
Lys Asp Leu Lys Ser Glu Pro Asn Pro Leu Arg Ser Met Met Asp Asn
25 30 35

atc aga aaa cgt gaa act gaa gtg gtc taacactcta tagaagatga 462
Ile Arg Lys Arg Glu Thr Glu Val Val
40 45
acaaaatctc tgaaagcagc tcaacctctt ctgagaaaaa aaatatattc tgaggccaac 522
tgttgctaca aaacaaattc tgactgaatg tttaaaacat ttctagtaga aggggaaaaa 582 
aaagttaaac atgcactgtt tgtgtgtata gecatttcat taaatataca gtaaaacttc 642 

35 ataaaaaaaa aaaaaaaaa 661 



<210> 143
<211> 1789
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 5..454

<220>
<221> sig_peptide
<222> 5..64
<223> Von Heijne matrix
      score 6.64507667657896
      seq LLPLLSLLVGAWL/KL

<400> 143

cctg atg gcc cgg cat ggg tta ccg ctg ctg ccc ctg ctg tcg ctc ctg 49
Met Ala Arg His Gly Leu Pro Leu Leu Pro Leu Leu Ser Leu Leu
-20 -15 -10

gtc ggc gcg tgg ctc aag cta gga aat gga cag gct act agc atg gtc 97
Val Gly Ala Trp Leu Lys Leu Gly Asn Gly Gln Ala Thr Ser Met Val
-5 1 5 10

caa ctg cag ggt ggg aga ttc ctg atg gga aca aat tct cca gac agc 145
Gln Leu Gln Gly Gly Arg Phe Leu Met Gly Thr Asn Ser Pro Asp Ser
15 20 25

aga gat ggt gaa ggg cct gtg cgg gag gcg aca gtg aaa ccc ttt gcc 193
Arg Asp Gly Glu Gly Pro Val Arg Glu Ala Thr Val Lys Pro Phe Ala
30 35 40

atc gac ata ttt cct gtc acc aac aaa gat ttc agg gat ttt gtc agg 241
Ile Asp Ile Phe Pro Val Thr Asn Lys Asp Phe Arg Asp Phe Val Arg
45 50 55

gag aaa aag tat cgg aca gaa gct gag atg ttt gga tgg agc ttt gtc 289
Glu Lys Lys Tyr Arg Thr Glu Ala Glu Met Phe Gly Trp Ser Phe Val
60 65 70 75

ttt gag gac ttt gtc tct gat gag ctg aga aac aaa gcc acc cag cca 337
Phe Glu Asp Phe Val Ser Asp Glu Leu Arg Asn Lys Ala Thr Gln Pro
80 85 90

atg aag gtc aag ttt acc cat ggg gga act ggt tcc agc caa acc gca 385
Met Lys Val Lys Phe Thr His Gly Gly Thr Gly Ser Ser Gln Thr Ala
95 100 105

cca acc tgt ggc agg gaa agt tcc cca agg gag aca aag ctg agg atg 433
Pro Thr Cys Gly Arg Glu Ser Ser Pro Arg Glu Thr Lys Leu Arg Met
110 115 120

gct tcc atg gag tct ccc cag tgaatgcttt ccccgcccag aacaactacg 484
Ala Ser Met Glu Ser Pro Gln
125 130

ggctctatga cctcctgggg aacgtgtggg agtggacagc atcaccgtac caggctgctg 544
agcaggacat gcgcgtcctc cggggggcat cctggatcga cacagctgat ggctctgcca 604
atcaccgggc ccgggtcacc accaggatgg gcaacactcc agattcagcc tcagacaacc 664
tcggtttccg ctgtgctgca gacgcaggcc ggccgccagg ggagctgtaa gcagccgggt 724
ggtgacaagg agaaaagcct tctagggtca ctgtcattcc ctggccatgt tgcaaacagc 784
gcaattccaa gctcgagagc ttcagcctca ggaaagaact tccccttccc tgtctcccat 844
ccctctgtgg caggcgcctc tcaccagggc aggagaggac tcagcctcct gtgttttgga 904
gaaggggccc aatgtgtgtt gacgatggct gggggccagg tgtttctgtt agaggccaag 964
tattattgac acaggattgc aaacacacaa acaattggaa cagagcactc tgaaaggcca 1024
ttttttaagc attttaaaat ctattctctc cccctttctc cctggatgat tcaggaagct 1084
gacattgttt cctcaaggca gaattttcct ggttctgttt tctcagccag ttgctgtgga 1144
aggagaatgc tttctttgtg gcctcatctg tggtttcgtg tccctctgaa ggaaactagt 1204
ttccactgtg taacaggcag acatgtaact atttaaagca cagttcagtc ctaaaagggt 1264
ctgggagaac cagatgatgt actaggtgaa gcattgcatt gtgggaatca caaagcaaat 1324
agtactccag aaagacaaat atcagaagct tcctattctt tttttttttt tttttttttt 1384
ttgagacagg gtctttctct gttgcccagg ctagagtgca ctggtgatca cggctcactc 1444
tagccttgaa ttcctgggcc caagcaattc tcccacctca gcctcctgag tagctgggac 1504
tacaagtgtg caccaccatg cctggctaat tttttgaatt tttgtagtga tgggatctcg 1564
ctctgttgcc cagggtggtc tcgaactcct ggcctcaagc gatcctccca cctcgacctc 1624
ccaaagtgct gggattacag gtgtgagcca cctcgcctgg gcccccttct ccatatgcct 1684
ccaaaaacat gtccctggag agtagcctgc tcccacactg tcactggatg tcatggggcc 1744
aataaaatct cctgcaattg tgtatctcaa aaaaaaaaaa aaaaa 1789



<210> 144
<211> 2006
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 241..1302

<400> 144

tagtgccgga gccccgccag agcccgactt cagccccagc cagatcccgc gtcaacggag 60
gcggaacggc ggaccccgta ccctggcagc atcggagcac cggcgggtga aggcaaggtc 120
cctggactgg tcatatacct cttgtggccc tggcagaatc aagatgaggc cctgtcatgc 180
ctccccagtg aggcctacag tctgagcaga cagcatggcc tgccactggc agtgaacacc 240

atg tct gca gga ggt ggc cgg gcc ttt gct tgg caa gtg ttc ccc ccc 288
Met Ser Ala Gly Gly Gly Arg Ala Phe Ala Trp Gln Val Phe Pro Pro
1 5 10 15

atg ccc act tgc cgg gtc tat ggc aca gtg gca cac caa gat ggg cac 336
Met Pro Thr Cys Arg Val Tyr Gly Thr Val Ala His Gln Asp Gly His
20 25 30

ctg ctg gtg ttg ggg ggt tgt ggc cgg gct gga ctg ccc ctg gac act 384
Leu Leu Val Leu Gly Gly Cys Gly Arg Ala Gly Leu Pro Leu Asp Thr
35 40 45

gct gag aca ctg gac atg gcc tcg cac aca tgg ctg gca ctg gca ccc 432
Ala Glu Thr Leu Asp Met Ala Ser His Thr Trp Leu Ala Leu Ala Pro
50 55 60

ctg ccc act gcc cgg gct ggt gca gct gcg gta gtt ctg ggc aag cag 480
Leu Pro Thr Ala Arg Ala Gly Ala Ala Ala Val Val Leu Gly Lys Gln
65 70 75 80

gtg cta gtg gtg tgt ggt gtg gat gag gtc cag agc ccg gta gct gct 528
Val Leu Val Val Cys Gly Val Asp Glu Val Gln Ser Pro Val Ala Ala
85 90 95

gta gag gcc ttc ctg atg gat gag ggc cgc tgg gag cgt cgg gcc acc 576
Val Glu Ala Phe Leu Met Asp Glu Gly Arg Trp Glu Arg Arg Ala Thr
100 105 110

ctc cct caa gca gcc atg ggg gtt gca act gtg gag aga gat ggt atg 624
Leu Pro Gln Ala Ala Met Gly Val Ala Thr Val Glu Arg Asp Gly Met
115 120 125

gtg tat gct ctg ggg gga atg ggc cct gac acg gcc ccc cag gcc cag 672
Val Tyr Ala Leu Gly Gly Met Gly Pro Asp Thr Ala Pro Gln Ala Gln
130 135 140

gta cgt gtg tat gac ccc cgt cgg gac tgc tgg ctt tcg cta ccc tcc 720
Val Arg Val Tyr Asp Pro Arg Arg Asp Cys Trp Leu Ser Leu Pro Ser
145 150 155 160

atg ccc aca ccc tgc tat ggg gcc tcc acc ttc ctg cac ggg aac aag 768
Met Pro Thr Pro Cys Tyr Gly Ala Ser Thr Phe Leu His Gly Asn Lys
165 170 175

atc tat gtc ctg ggg ggc cgc cag ggc aag ctc ccg gtg act gct ttt 816
Ile Tyr Val Leu Gly Gly Arg Gln Gly Lys Leu Pro Val Thr Ala Phe
180 185 190

gaa gcc ttt gat ctg gag gcc cgt aca tgg acc cgg cat cca agc cta 864
Glu Ala Phe Asp Leu Glu Ala Arg Thr Trp Thr Arg His Pro Ser Leu
195 200 205

ccc agc cgt cgg gcc ttt gct ggc tgc gcc atg gct gaa ggc agc gtc 912
Pro Ser Arg Arg Ala Phe Ala Gly Cys Ala Met Ala Glu Gly Ser Val
210 215 220

ttt agc ctg ggt ggc ctg cag cag cct ggg ccc cac aac ttc tac tct 960
Phe Ser Leu Gly Gly Leu Gln Gln Pro Gly Pro His Asn Phe Tyr Ser
225 230 235 240

cgc cca cac ttt gtc aac act gtg gag atg ttt gac ctg gag cat ggg 1008
Arg Pro His Phe Val Asn Thr Val Glu Met Phe Asp Leu Glu His Gly
245 250 255

tcc tgg acc aaa ttg ccc cgc agc ctg cgc atg agg gat aag agg gca 1056
Ser Trp Thr Lys Leu Pro Arg Ser Leu Arg Met Arg Asp Lys Arg Ala
260 265 270

gac ttt gtg gtt ggg tcc ctt ggg ggc cac att gtg gcc att ggg ggc 1104
Asp Phe Val Val Gly Ser Leu Gly Gly His Ile Val Ala Ile Gly Gly
275 280 285

ctt gga aac cag cca tgt cct ttg ggc tct gtg gag agc ttt agc ctt 1152
Leu Gly Asn Gln Pro Cys Pro Leu Gly Ser Val Glu Ser Phe Ser Leu
290 295 300

gca cgg cgg cgc tgg gag gca ttg cct gcc atg ccc act gcc cgc tgc 1200
Ala Arg Arg Arg Trp Glu Ala Leu Pro Ala Met Pro Thr Ala Arg Cys
305 310 315 320

tcc tgc tct agt ctg cag gct ggg ccc cgg ctg ttt gtt att ggg ggt 1248
Ser Cys Ser Ser Leu Gln Ala Gly Pro Arg Leu Phe Val Ile Gly Gly
325 330 335

gtg gcc cag ggc ccc agt caa gcc gtg gag gca ctg tgt ctg cgt gat 1296
Val Ala Gln Gly Pro Ser Gln Ala Val Glu Ala Leu Cys Leu Arg Asp
340 345 350

ggg gtc tgaaggcttg gtgggagctg tccactggag cagctcattg ccagaggcag 1352
Gly Val

ctatttctat ggctcctttt gctgctgagg acactcactg tggctctgtg ggatgagaga 1412
ggcatggggg tgagcacttg aaacactgcc ttggggcctt gggttagggg agcctttgtc 1472
tttagtgcag gacacacata tgcttacacc tacctttatc accattcgtt catgaatcat 1532
gcctagctcc atccttgccc tgggacctac taggccttcc atccaactgg gaaatgggga 1592
gaagcaaagc tggcctcatg ctcttcaggg tcagttccta tctggagttg accaggccta 1652
ccccagttgc cattcctgaa aaatctcagc tgccaggctg cctttagggt ccctgcagac 1712
ccaggagagt tgagagggtg ggggacacac acagaataga gaggatgtgg gaactgccag 1772
agggccggag cgcaggagtt caagtggagg aatgctggct ttgagccctc tacactgctg 1832
gttgtatgac cttggacaag tcacttcacc tctctgtgcc tcagcatcct catctataaa 1892
tggggatctc tgaaaccttc ctaccctacc tacctcacag ggctgttgtg aggacccagg 1952
gagtttggat gtggaagtaa aagtgctgct aaaaccgaaa aaaaaaaaaa aaaa 2006

<210> 145
<211> 1096
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 15..635

<400> 145

atccaaggcg caag atg gcg ctg ctt ttt gca cgt tct ttg cgc ttg tgc 50
Met Ala Leu Leu Phe Ala Arg Ser Leu Arg Leu Cys
1 5 10

cgc tgg gga gcc aaa cga ttg gga gtt gcc tcc aca gag gcc cag aga 98
Arg Trp Gly Ala Lys Arg Leu Gly Val Ala Ser Thr Glu Ala Gln Arg
15 20 25

ggc gtc agt ttc aaa ctg gaa gaa aaa acc gcc cac agc agc ctg gca 146
Gly Val Ser Phe Lys Leu Glu Glu Lys Thr Ala His Ser Ser Leu Ala
30 35 40

ctc ttc aga gat gat acg ggt gtc aaa tat ggc ttg gtg gga ttg gag 194
Leu Phe Arg Asp Asp Thr Gly Val Lys Tyr Gly Leu Val Gly Leu Glu
45 50 55 60

ccc acc aag gtg gcc ttg aat gtg gag cgc ttc cgg gag tgg gca gtg 242
Pro Thr Lys Val Ala Leu Asn Val Glu Arg Phe Arg Glu Trp Ala Val
65 70 75

gtg ctg gca gac aca gcg gtc acc agt ggc aga cac tac tgg gaa gtg 290
Val Leu Ala Asp Thr Ala Val Thr Ser Gly Arg His Tyr Trp Glu Val
80 85 90

aca gtg aag cgc tcc cag cag ttc cgg ata gga gtg gca gat gtg gac 338
Thr Val Lys Arg Ser Gln Gln Phe Arg Ile Gly Val Ala Asp Val Asp
95 100 105

atg tcc cgg gat agc tgc att ggt gtt gat gat cgt tcc tgg gtg ttc 386
Met Ser Arg Asp Ser Cys Ile Gly Val Asp Asp Arg Ser Trp Val Phe
110 115 120

acc tat gcc cag cgc aag tgg tac acc atg ttg gcc aac gag aaa gcc 434
Thr Tyr Ala Gln Arg Lys Trp Tyr Thr Met Leu Ala Asn Glu Lys Ala
125 130 135 140

cca gtt gag ggt att ggg cag cca gag aag gtg ggg ctg ttg ctg gag 482
Pro Val Glu Gly Ile Gly Gln Pro Glu Lys Val Gly Leu Leu Leu Glu
145 150 155

tat gag gcc cag aag ctg agc ctg gtg gat gtg agc cag gtc tct gtg 530
Tyr Glu Ala Gln Lys Leu Ser Leu Val Asp Val Ser Gln Val Ser Val
160 165 170

gtt cac acg cta cag aca gat ttc cgg ggt cca gtg gtg cct gcc ttt 578
Val His Thr Leu Gln Thr Asp Phe Arg Gly Pro Val Val Pro Ala Phe
175 180 185

gct ctc tgg gat ggg gag ctg ctg acc cat tca ggg ctt gag gtg ccc 626
Ala Leu Trp Asp Gly Glu Leu Leu Thr His Ser Gly Leu Glu Val Pro
190 195 200

gag ggc ctc tagtatgtcc attactggag tccctaatca cgcctttggc 675
Glu Gly Leu
205

cagcctcctt ttgaaagtgt ccgaagcctt tttactttgc ctcaagcaac ctctagctcc 735
cacaattcag tgttgggtcc tctgtgcaat atcatgatca tcttcctcat cccctacctt 795
gtgaaagcta ggcatacagc caaaccctcc ttttccccac ccaccaacac tactgccaat 855
ttcctaggct accatgggtg tatcttcctt gacctgcttc cttcagtccc tctgcctccc 915
tttgcccagg cctttctcag actgtattcc atcctggggt cttatcattc agctttgttt 975
gaatttatta atcaccatga tacctctccc tccctttgtc cacatgtaac ttgttcttgg 1035
ggctctacca gatggctgaa gagtaaatcc tttctacctc tggcaaaaaa aaaaaaaaaa 1095
1096



<210> 146
<211> 1666
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 109..738

<400> 146

cccagcgttc ctcctccggc cccaggtcac cgccagcacg 
gagtccacgc agctccccag gcccttcacc agcacagcag 



50 



age gtg gag 
Ser Val Glu 
5 

ttc ttc cga 
Phe Phe Arg 
20 

ctg ctg ctg 
Leu Leu Leu 



ggt 
Gly 

gac 
Asp 



att 
40 He 

ttg 
Leu 
100 
45 tac 
Tyr 



gtg 
Val 

gec 
Ala 



atg 
55 Met 

gtg 
Val 
180 
60 gag 
Glu 



aca ctg 
Thr Leu 

ctg cca 
Leu Pro 

70 
ggg gag 
Gly Glu 
85 

gag ctg 
Glu Leu 

tec ctg 
Ser Leu 

cca gtg 
Pro Val 

agt gtg 
Ser Val 
150 
ggt cag 
Gly Gin 
165 

ctg ate 
Leu He 

gag tgg 
Glu Trp 



cag cgc 
Gin Arg 

gag gee 
Glu Ala 

cat ggt 
His Gly 

40 
cac agg 
His Arg 
55 

ggt ctg 
Gly Leu 

ctg gee 
Leu Ala 

ggc ccc 
Gly Pro 



gag ggc acc 
Glu Gly Thr 
10 

ctg ccc ggc 
Leu Pro Gly 
25 

att cgc ttc 
He Arg Phe 

ctg gec cag 
Leu Ala Gin 



ate cag gtg 
He Gin Val 



ccc 
Pro 

gec 
Ala 
135 
aag 
Lys 



ttc 
Phe 
120 
ccc 
Pro 

act 
Thr 



ggg 

Gly 

cct 
Pro 

ccg 
Pro 
105 
etc 
Leu 



cac tec 
His Ser 

75 
ggc age 
Gly Ser 
90 

gtt gtg 
Val Val 

acg gec 
Thr Ala 



agt 
Ser 

tec 
Ser 

get 

Ala 

60 

aag 

Lys 



ggg 

Gly 

tec 

Ser 

45 

ggc 

Gly 



cag 

Gin 

30 

gag 

Glu 

tac 
Tyr 



gaa gca 
Glu Ala 



ttc ctg gcg 
Phe Leu Ala 



ate tgc act 
He Cys Thr 



tgaagcccag 



acc age 
Thr Ser 

atg aag 
Met Lys 

cat aca 
His Thr 
200 
cactgetg 



cca get 
Pro Ala 

ttt gag 
Phe Glu 
170 

ggg gcg 

Gly Ala 
185 

ggg ctg 
Gly Leu 



ctg 
Leu 
155 
cac 
His 



ate agt 
He Ser 

cct ggc 
Pro Gly 
125 
gac aaa 
Asp Lys 
140 

att gta 
He Val 



cca 
Pro 
110 
tec 
Ser 

ate 
He 

tat 
Tyr 



ctg aag cag 
Leu Lys Gin 



ggg cac ccc tgt 
Gly His Pro Cys 
190 

ctg gac ttc ctg 
Leu Asp Phe Leu 
205 

ca gggggtgggc tgcctgcctg 



cgcctgcttc ccgtctgcgc 
cagcaggc atg gca gca 

Met Ala Ala 

1 

cag ggc cag gee etc 
Gin Gly Gin Ala Leu 
15 

get cgc ttc tct gta 
Ala Arg Phe Ser Val 
35 

acc tgg cag aac ctg 
Thr Trp Gin Asn Leu 
50 

egg get gtg gec att 
Arg Ala Val Ala He 
65 

gca gee cct gee cct 
Ala Ala Pro Ala Pro 
80 

get gtg gtg gat gec 
Ala Val Val Asp Ala 
95 

tea ctg agt ggc atg 
Ser Leu Ser Gly Met 
115 

cag etc ccg ggc ttt 
Gin Leu Pro Gly Phe 
130 

aat get gec aac tat 
Asn Ala Ala Asn Tyr 
145 

gga gac cag gac ccc 
Gly Asp Gin Asp Pro 
160 

ctg ccc aac cac egg 
Leu Pro Asn His Arg 
175 

tac ctg gac aaa cca 
Tyr Leu Asp Lys Pro 
195 

ca g ggg etc cag 

Gin Gly Leu Gin 
210 

ctctgagctc tctcttgcac 



60 
117 



165 



213 



261 



309 



357 



405 



453 



501 



549 



597 



645 



693 



738 



798 



gctctctctt ctctcccagg ctctggctca tgcacatgca acaggtgcgt ctgtctatat 858
gtctgggttc ttgtcttttg tggtctgttt gtcttttcta cctctttctc ttgcagtgat 918
agactgaggg ggtaaaatca agagaaaaaa ctctcaggaa tcaaggaaca taatcctgtg 978
gagggtaatc cattacatga gcttctcctg ttcttccact ttcctgcctg gctttcactc 1038
cttcccctgc tctgcccagc ctttccctcc cacccactcc tacttctgca aatgccctga 1098
aggccagccc ttaccccaac acccacttcc ccacctcctt aggccccaga tacatacatg 1158
cccacatgca cgcttacatg tttagagcca tccttgtttc caaatatgac ccttcgcttg 1218
agggcaactg cataggtaca tctaactctg gactggcatg cacattgtca tgtgcagctt 1278
tgcatataca cacatgcata catgagcctc cacacaagca cttgcacaca tgtggactcc 1338
taaccatgct aacctcactg gctgggaagg tggggacccc atgggccagc ccttgcagga 1398
ggcccttttg caaggcttag ggtgtggcca gccctgaaag ctacttggac acaggtttca 1458
gctggcccca gcccagaagt gacccccaga aagggagggc caccgctttg ccccctgctt 1518
ttacccttcc ttctgggtgc tctacacctc aggttaccag gcctgaggca tctcagccaa 1578
gcttgtttcc tgctctgagg cttgtggggt gggagccaga gtggaggtcg gtgaaataaa 1638
gtgatgcaat taaaaaaaaa aaaaaaaa 1666



<210> 147
<211> 1687
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 21..1145

<400> 147

gtttctccgg acttcgagcc atg gcg gtg acg gaa gcg agc ctg ttg cgc cag 53
Met Ala Val Thr Glu Ala Ser Leu Leu Arg Gln
1 5 10

tgc ccc ctg ctt ctg ccc cag aac cgg tcg aaa acc gtg tat gag gga 101
Cys Pro Leu Leu Leu Pro Gln Asn Arg Ser Lys Thr Val Tyr Glu Gly
15 20 25

ttc atc tcg gct cag gga aga gac ttc cac ctt agg ata gtg ttg cct 149
Phe Ile Ser Ala Gln Gly Arg Asp Phe His Leu Arg Ile Val Leu Pro
30 35 40

gaa gat tta caa ctg aag aat gca aga tta tta tgt att tgg cag ctg 197
Glu Asp Leu Gln Leu Lys Asn Ala Arg Leu Leu Cys Ile Trp Gln Leu
45 50 55

aga aca ata ctt agt gga tac cat cga ata gta caa cag aga atg cag 245
Arg Thr Ile Leu Ser Gly Tyr His Arg Ile Val Gln Gln Arg Met Gln
60 65 70 75

cac tct cct gat cta atg agc ttt atg atg gag ttg aag atg ctt ttg 293
His Ser Pro Asp Leu Met Ser Phe Met Met Glu Leu Lys Met Leu Leu
80 85 90

gaa gtt gcc tta aag aat aga caa gag ctg tat gca cta cct cct cct 341
Glu Val Ala Leu Lys Asn Arg Gln Glu Leu Tyr Ala Leu Pro Pro Pro
95 100 105

ccc cag ttc tac tca agc ctt att gaa gag ata gga act ctt ggt tgg 389
Pro Gln Phe Tyr Ser Ser Leu Ile Glu Glu Ile Gly Thr Leu Gly Trp
110 115 120

gat aaa ctt gtg tat gcg gat acc tgc ttc agt acc atc aag tta aaa 437
Asp Lys Leu Val Tyr Ala Asp Thr Cys Phe Ser Thr Ile Lys Leu Lys
125 130 135

gca gaa gat gct tct ggt aga gag cat tta atc act ctc aag ttg aag 485
Ala Glu Asp Ala Ser Gly Arg Glu His Leu Ile Thr Leu Lys Leu Lys
140 145 150 155

gca aag tat cct gca gaa tca cca gat tat ttt gtg gat ttt cct gtt 533
Ala Lys Tyr Pro Ala Glu Ser Pro Asp Tyr Phe Val Asp Phe Pro Val
160 165 170

cca ttt tgt gcc tcc tgg aca cct cag agc tcc tta ata agc att tat 581
Pro Phe Cys Ala Ser Trp Thr Pro Gln Ser Ser Leu Ile Ser Ile Tyr
175 180 185

agt cag ttt ttg gca gca ata gaa tca cta aag gca ttc tgg gat gtt 629
Ser Gln Phe Leu Ala Ala Ile Glu Ser Leu Lys Ala Phe Trp Asp Val
190 195 200

atg gat gaa atc gat gag aag acc tgg gta ctt gag cca gaa aaa cct 677
Met Asp Glu Ile Asp Glu Lys Thr Trp Val Leu Glu Pro Glu Lys Pro
205 210 215

cca cgg agt gca aca gca cgc aga att gca tta ggt aat aat gtt tcc 725
Pro Arg Ser Ala Thr Ala Arg Arg Ile Ala Leu Gly Asn Asn Val Ser
220 225 230 235

ata aat ata gag gta gac ccc agg cat cct act atg ctt cct gag tgc 773
Ile Asn Ile Glu Val Asp Pro Arg His Pro Thr Met Leu Pro Glu Cys
240 245 250

ttc ttt ctt gga gct gac cat gtg gta aaa ccc ctg gga att aag ctg 821
Phe Phe Leu Gly Ala Asp His Val Val Lys Pro Leu Gly Ile Lys Leu
255 260 265

agc agg aac ata cat ttg tgg gat cca gaa aat agt gtg tta caa aat 869
Ser Arg Asn Ile His Leu Trp Asp Pro Glu Asn Ser Val Leu Gln Asn
270 275 280

ttg aaa gat gtt tta gaa att gat ttt cca gct cgt gct atc ctg gaa 917
Leu Lys Asp Val Leu Glu Ile Asp Phe Pro Ala Arg Ala Ile Leu Glu
285 290 295

aaa tct gat ttt act atg gat tgt gga att tgt tat gct tat caa ctt 965
Lys Ser Asp Phe Thr Met Asp Cys Gly Ile Cys Tyr Ala Tyr Gln Leu
300 305 310 315

gac ggt acc att cct gat caa gtg tgt gat aat tcc cag tgt gga caa 1013
Asp Gly Thr Ile Pro Asp Gln Val Cys Asp Asn Ser Gln Cys Gly Gln
320 325 330

cct ttc cat caa ata tgc tta tat gag tgg ctg aga gga cta cta act 1061
Pro Phe His Gln Ile Cys Leu Tyr Glu Trp Leu Arg Gly Leu Leu Thr
335 340 345

agt aga cag agt ttt aac atc ata ttt ggt gaa tgt cca tat tgt agt 1109
Ser Arg Gln Ser Phe Asn Ile Ile Phe Gly Glu Cys Pro Tyr Cys Ser
350 355 360

aag cca att acc tta aaa atg tct gga agg aaa cac tgaaataaga 1155
Lys Pro Ile Thr Leu Lys Met Ser Gly Arg Lys His
365 370 375

atacaacatt tcggtgaaga gctggaaact taaaaaatta tcaaaaggaa ttttggtatc 1215
atcttcagag aaaaaataaa gcaagraata ctaacatcaa aaggacaggt atgatgatgc 1275
gataataata aacatctgcg tttgtctctt cactaagagt aaactgggaa attgtaggcc 1335
aaagtccagt tgaactttct aagtctgtga tccccgtgct gactgtggaa gtgtatttat 1395
accaagatgg agatcttgac ttcttgaata tatctggact ggtaaaatct tgatgaggct 1455
cataaaatga gtttgggaat tgtgtatagc tgattttttg tgggaaactg tttacttcat 1515
tcaaaggttc ttgagactct tgatatttct gtcttctcct tgtgctttcc tatggaaaaa 1575
atacatatat agtttagttt gttagacgtg agttatccaa gtatttattt tgtgtagtgt 1635
gtaagaatgc taaataaaat gttatacagg aaaaaaaaaa aaaaatgcga aa 1687

<210> 148
<211> 1747
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 70..1596



<400> 148

gttgggcggc cggtagctgt tgctgttggg ggaccccctc attcctgccg ctgccgtccc 60
tgctgcctc atg gcg gcc atc gga gtt cac ctg ggc tgc acc tca gcc tgt 111
Met Ala Ala Ile Gly Val His Leu Gly Cys Thr Ser Ala Cys
1 5 10

gtg gcc gtc tat aag gat ggc cgg gct ggt gtg gtt gca aat gat gcc 159
Val Ala Val Tyr Lys Asp Gly Arg Ala Gly Val Val Ala Asn Asp Ala
15 20 25 30

ggt gac cga gtt act cca gct gtt gtt gct tac tca gaa aat gaa gag 207
Gly Asp Arg Val Thr Pro Ala Val Val Ala Tyr Ser Glu Asn Glu Glu
35 40 45

att gtt gga ttg gca gca aaa caa agt aga ata aga aat att tca aat 255
Ile Val Gly Leu Ala Ala Lys Gln Ser Arg Ile Arg Asn Ile Ser Asn
50 55 60

aca gta atg aaa gta aag cag atc ctg ggc aga agc tcc agt gat cca 303
Thr Val Met Lys Val Lys Gln Ile Leu Gly Arg Ser Ser Ser Asp Pro
65 70 75

caa gct cag aaa tac atc gcg gaa agt aaa tgt tta gtc att gaa aaa 351
Gln Ala Gln Lys Tyr Ile Ala Glu Ser Lys Cys Leu Val Ile Glu Lys
80 85 90

aat ggg aaa tta cga tat gaa ata gat act gga gaa gaa aca aaa ttt 399
Asn Gly Lys Leu Arg Tyr Glu Ile Asp Thr Gly Glu Glu Thr Lys Phe
95 100 105 110

gtt aac cca gaa gat gtt gcc aga ctg ata ttt agt aaa atg aaa gaa 447
Val Asn Pro Glu Asp Val Ala Arg Leu Ile Phe Ser Lys Met Lys Glu
115 120 125

acg gca cat tct gta ttg ggc tca gat gca aat gat gta gtt att act 495
Thr Ala His Ser Val Leu Gly Ser Asp Ala Asn Asp Val Val Ile Thr
130 135 140

gtc ccg ttt gat ttt gga gaa aag caa aaa aat gct ctt gga gaa gca 543
Val Pro Phe Asp Phe Gly Glu Lys Gln Lys Asn Ala Leu Gly Glu Ala
145 150 155

act aga gct gct gga ttt aat gtt ttg cga tta att cac gaa ccg tct 591
Ala Arg Ala Ala Gly Phe Asn Val Leu Arg Leu Ile His Glu Pro Ser
160 165 170

gca gct ctt ctt gct tat gga att gga caa gac tcc cct act gga aaa 639
Ala Ala Leu Leu Ala Tyr Gly Ile Gly Gln Asp Ser Pro Thr Gly Lys
175 180 185 190

agc aat att ttg gtg ttt aag ctt gga gga aca tcc tta tct ctc agc 687
Ser Asn Ile Leu Val Phe Lys Leu Gly Gly Thr Ser Leu Ser Leu Ser
195 200 205

gtc atg gaa gtt aac agt gga ata tat cgg gtt ctt tca aca aac act 735
Val Met Glu Val Asn Ser Gly Ile Tyr Arg Val Leu Ser Thr Asn Thr
210 215 220

aat gat aac atc ggt ggt gca cat ttc aca gaa acc tta gca cag tat 783
Asp Asp Asn Ile Gly Gly Ala His Phe Thr Glu Thr Leu Ala Gln Tyr
225 230 235

cta gct tct gag ttc caa aga tcc ttc aaa cat gat gtg aga gga aat 831
Leu Ala Ser Glu Phe Gln Arg Ser Phe Lys His Asp Val Arg Gly Asn
240 245 250

gcg cga gcc atg atg aaa tta acg aac agt gct gaa gta gcg aaa cat 879
Ala Arg Ala Met Met Lys Leu Thr Asn Ser Ala Glu Val Ala Lys His
255 260 265 270

tct ttg tca acc ttg gga agt gcc aac tgt ttt ctt gac tca tta tat 927
Ser Leu Ser Thr Leu Gly Ser Ala Asn Cys Phe Leu Asp Ser Leu Tyr
275 280 285

gaa ggt caa gat ttt gat tgc aat gtg tcc aga gca aga ttt gaa ctt 975
Glu Gly Gln Asp Phe Asp Cys Asn Val Ser Arg Ala Arg Phe Glu Leu
290 295 300

ctt tgt tct cca ctt ttt aat aag tgt ata gaa gca atc aga gga ctc 1023
Leu Cys Ser Pro Leu Phe Asn Lys Cys Ile Glu Ala Ile Arg Gly Leu
305 310 315

tta gat caa aat gga ttt aca aca gat gat atc aac aag gtt gtc ctt 1071
Leu Asp Gln Asn Gly Phe Thr Thr Asp Asp Ile Asn Lys Val Val Leu
320 325 330

tgt gga ggg tct tct cga atc cca aag cta cag caa ctg att aaa gat 1119
Cys Gly Gly Ser Ser Arg Ile Pro Lys Leu Gln Gln Leu Ile Lys Asp
335 340 345 350

ctt ttc cca gct gtt gag ctt ctc aat tct atc cct cct gat gaa gtg 1167
Leu Phe Pro Ala Val Glu Leu Leu Asn Ser Ile Pro Pro Asp Glu Val
355 360 365

atc cct att ggt gca gct ata gaa gca gga att ctt att ggg aaa gaa 1215
Ile Pro Ile Gly Ala Ala Ile Glu Ala Gly Ile Leu Ile Gly Lys Glu
370 375 380

aac ctg ttg gtg gaa gac tct ctt atg ata gag tgt tca gcc aga gat 1263
Asn Leu Leu Val Glu Asp Ser Leu Met Ile Glu Cys Ser Ala Arg Asp
385 390 395

att tta gtt aag ggt gtg gac gaa tca gga gcc agt aga ttc aca gtg 1311
Ile Leu Val Lys Gly Val Asp Glu Ser Gly Ala Ser Arg Phe Thr Val
400 405 410

ctg ttt cca tca ggg act cct ttg cca gct cga aga caa cac aca ttg 1359
Leu Phe Pro Ser Gly Thr Pro Leu Pro Ala Arg Arg Gln His Thr Leu
415 420 425 430

caa gcc cct gga agc ata tct tca gtg tgc ctt gaa ctc tat gag tct 1407
Gln Ala Pro Gly Ser Ile Ser Ser Val Cys Leu Glu Leu Tyr Glu Ser
435 440 445

gat ggg aag aac tct gcc aaa gag gaa acc aag ttt gca cag gtt gta 1455
Asp Gly Lys Asn Ser Ala Lys Glu Glu Thr Lys Phe Ala Gln Val Val
450 455 460

ctc cag gat tta gat aaa aaa gaa aat gga tta cgt gat ata tta gct 1503
Leu Gln Asp Leu Asp Lys Lys Glu Asn Gly Leu Arg Asp Ile Leu Ala
465 470 475

gtt ctt act atg aaa agg gat gga tct tta cat gtg aca tgc aca gat 1551
Val Leu Thr Met Lys Arg Asp Gly Ser Leu His Val Thr Cys Thr Asp
480 485 490

caa gaa act gga aaa tgt gaa gca atc tct att gag ata gca tct 1596
Gln Glu Thr Gly Lys Cys Glu Ala Ile Ser Ile Glu Ile Ala Ser
495 500 505

tagtgtttta gagaaatcaa gaatttttaa aaacaagaat atcaacattt ggttttgtgt 1656
ataagtggtg tttgtattaa aatacttttt caatgaactg tataaactat gttttattaa 1716
actacaatat atcagtaaaa aaaaaaaaaa a 1747

<210> 149
<211> 658
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 129..362

<400> 149

agtcaggaaa atgaagctga acatcaagtc ccagcaagaa aagaaggaaa aggaatgcca 60
agcagtcatg ttagcagttt gaaggggctg gagcaagatg gaatcaggaa taaggagtca 120
gtgggacc atg tac aac act gga aga cac gta tcc ctt cgc ctg gac aag 170
Met Tyr Asn Thr Gly Arg His Val Ser Leu Arg Leu Asp Lys
1 5 10

gag cac ttg gtc aac ata tct gga ggg ccc atg aca tac agc cac cgg 218
Glu His Leu Val Asn Ile Ser Gly Gly Pro Met Thr Tyr Ser His Arg
15 20 25 30

ctg gag gag atc cga cta cac ttt ggg agt gag gac agc caa ggg tcg 266
Leu Glu Glu Ile Arg Leu His Phe Gly Ser Glu Asp Ser Gln Gly Ser
35 40 45

gag cac ctc ctc aat gga cag gcc ttc tct ggg gag ctt caa gag agg 314
Glu His Leu Leu Asn Gly Gln Ala Phe Ser Gly Glu Leu Gln Glu Arg
50 55 60

gat ttg ttc atc ttg ttg act tct gta tca gga cat ctg ccc gat aca 362
Asp Leu Phe Ile Leu Leu Thr Ser Val Ser Gly His Leu Pro Asp Thr
65 70 75

tagaaaaagt ctgctgaccc ctgaattaca gtatgagcca ttcggaatgc atttctcttt 422
aaaagttctc gcctcattca gtgtctggaa cacagtgggt gctccccaat aggtgacacc 482
ttcctcaagt ttccttggga gaacagactc aatgtcggat ccacaaagga gacctgcaca 542
tacctaaccc ctatttctgc agaagctgaa ggctgtatta tctattgctt gcataataaa 602
tattgcataa cgacaacaat agtaaaaaaa aaaaaaaaaa gaaaaaaaaa aaaaaa 658
<210> 150
<211> 2045
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 109..594

<400> 150

attattacta caggaaaaac tgttctcttc tgtggcacag agaaccctgc ttcaaagcag 60
aagtagcagt tccggagtcc agctggctaa aactcatccc agaggata atg gca acc 117
Met Ala Thr
1

cat gcc tta gaa atc gct ggg ctg ttt ctt ggt ggt gtt gga atg gtg 165
His Ala Leu Glu Ile Ala Gly Leu Phe Leu Gly Gly Val Gly Met Val
5 10 15

ggc aca gtg gct gtc act gtc atg cct cag tgg ata gtg tcg gcc ttc 213
Gly Thr Val Ala Val Thr Val Met Pro Gln Trp Ile Val Ser Ala Phe
20 25 30 35

att gaa aac aac atc gtg gtt ttt gaa aac ttc tgg gaa gga ctg tgg 261
Ile Glu Asn Asn Ile Val Val Phe Glu Asn Phe Trp Glu Gly Leu Trp
40 45 50

atg aat tgc gtg agg cag gct aac atc agg atg cag tgc aaa atc tat 309
Met Asn Cys Val Arg Gln Ala Asn Ile Arg Met Gln Cys Lys Ile Tyr
55 60 65

gat tcc ctg ctg gct ctt tct ccg gac cta cag gca gcc aga gga ctg 357
Asp Ser Leu Leu Ala Leu Ser Pro Asp Leu Gln Ala Ala Arg Gly Leu
70 75 80

atg tgt gct gct tcc gtg atg tcc ttc ttg gct ttc atg atg gcc atc 405
Met Cys Ala Ala Ser Val Met Ser Phe Leu Ala Phe Met Met Ala Ile
85 90 95

ctt ggc atg aaa tgc acc agg tgc acg ggg gac aat gag aag gtg aag 453
Leu Gly Met Lys Cys Thr Arg Cys Thr Gly Asp Asn Glu Lys Val Lys
100 105 110 115

gct cac att ctg ctg acg gct gga atc atc ttc atc atc acg ggc atg 501
Ala His Ile Leu Leu Thr Ala Gly Ile Ile Phe Ile Ile Thr Gly Met
120 125 130

gtg gtg ctc atc cct gtg agc tgg gtt gcc aat gcc atc atc aga gat 549
Val Val Leu Ile Pro Val Ser Trp Val Ala Asn Ala Ile Ile Arg Asp
135 140 145

ttc tat aac cca ata gtg aat gtt gcc caa aaa cgt gag ctt gga 594
Phe Tyr Asn Pro Ile Val Asn Val Ala Gln Lys Arg Glu Leu Gly
150 155 160

taagctctct acttaggatg gaccacggca ctggtgctga ttgttggagg agctctgttc 654
tgctgcgttt tttgttgcaa cgaaaagagc agtagctaca gatactcgat accttcccat 714
cgcacaaccc aaaaaagtta tcacaccgga aagaagtcac cgagcgtcta ctccagaagt 774
cagtatgtgt agttgtgtat gtttttttaa ctttactata aagccatgca aatgacaaaa 834
atctatatta ctttctcaaa atggacccca aagaaacttt gatttactgt tcttaactgc 894
ctaatcttaa ttacaggaac tgtgcatcag ctatttatga ttctataagc tatttcagca 954
gaatgagata ttaaatccaa tgctttgatt gttctagaaa gtatagtaat ttgttttcta 1014
aggtggttca agcatctact ctttttatca tttacttcaa aatgacattg ctaaagactg 1074
cattatttta ctactgtaat ttctccacga catagcatta tgtacataga tgagtgtaac 1134
atttatatct cacatagaga catgcttata tggttttatt taaaatgaaa tgccagtcca 1194
ttacactgaa taaatagaac tcaactattg cttttcaggg aaatcatgga tagggttgaa 1254
gaaggttact attaattgtt taaaaacagc ttagggatta atgtcctcca tttataatga 1314
agattaaaat gaaggcttta atcagcattg taaaggaaat tgaatggctt tctgatatgc 1374
tgttttttag cctaggagtt agaaatccta acttctttat cctcttctcc cagaggcttt 1434
ttttttcttg tgtattaaat taacattttt aaaaagcaga tattttgtca aggggctttg 1494
cattcaaact gcttttccag ggctatactc agaagaaaga taaaagtgtg atctaagaaa 1554
aagtgatggt tttaggaaag tgaaaatatt tttgtttttg tatttgaaga agaatgatgc 1614
attttgacaa gaaatcatat atgtatggat atattttaat aagtatttga gtacagactt 1674
tgaggtttca tcaatataaa taaaagagca gaaaaatatg tcttggtttt catttgctta 1734
ccaaaaaaac aacaacaaaa aaagttgtcc tttgagaact tcacctgctc ctatgtgggt 1794
acctgagtca aaattgtcat ttttgttctg tgaaaaataa atttccttct tgtaccattt 1854
ctgtttagtt ttactaaaat ctgtaaatac tgtatttttc tgtttattcc aaatttgatg 1914
aaactgacaa tccaatttga aagtttgtgt cgacgtctgt ctagcttaaa tgaatgtgtt 1974
ctatttgctt tatacattta tattaataaa ttgtacattt ttctaattat ttggaaaaaa 2034
aaaaaaaaaa a 2045

<210> 151
<211> 788
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 150..587

<400> 151

attttcaaat ttacccctct gtgacttgta agccatgcaa ttcgtagggc taaatatgca 60
gttgttcgat ttcacggttt ggaatctctt gtcaagggac tgggactctt caattaatct 120
gacatttcac aaatccaaaa ttgccgtgg atg aac tct tta ctt cac ttc ggg 173
Met Asn Ser Leu Leu His Phe Gly
1 5

ata ttg ctg gag ctg agt ctc ctg aaa cag ttt aag tct gta tat gtt 221
Ile Leu Leu Glu Leu Ser Leu Leu Lys Gln Phe Lys Ser Val Tyr Val
10 15 20

cct gga aat cat acc cac cag gca tct tat aag cca ttg ttg aag caa 269
Pro Gly Asn His Thr His Gln Ala Ser Tyr Lys Pro Leu Leu Lys Gln
25 30 35 40

gtt gtg gag gaa ata ttt cat ccc gag agg cca gat tcc gtt gat att 317
Val Val Glu Glu Ile Phe His Pro Glu Arg Pro Asp Ser Val Asp Ile
45 50 55

gaa cac atg tct tca ggc ctc act gat ctc ctt aaa act gga ttt agc 365
Glu His Met Ser Ser Gly Leu Thr Asp Leu Leu Lys Thr Gly Phe Ser
60 65 70

atg ttc atg aag gtg agc cgg cct cat cct agt gac tac ccc ctc ctg 413
Met Phe Met Lys Val Ser Arg Pro His Pro Ser Asp Tyr Pro Leu Leu
75 80 85

atc ctc ttt gtg gta ggt ggg gtc aca gtc tct gaa gtg aaa atg gtc 461
Ile Leu Phe Val Val Gly Gly Val Thr Val Ser Glu Val Lys Met Val
90 95 100

aaa gat ctt gtg gca tcg ttg aag cca gga acc cag gta atc gtg ctg 509
Lys Asp Leu Val Ala Ser Leu Lys Pro Gly Thr Gln Val Ile Val Leu
105 110 115 120

tcc aca cga ctc ctg aag cca ctt aac att cct gag ctg tta ttt gca 557
Ser Thr Arg Leu Leu Lys Pro Leu Asn Ile Pro Glu Leu Leu Phe Ala
125 130 135

act gac cga ctg cat cca gac ctt ggc ttc tgagcatccg ctaagaagat 607
Thr Asp Arg Leu His Pro Asp Leu Gly Phe
140 145

aagacctact caagctggaa atgccgatgc aattttctgc caccactcca aatactcctc 667
cacaaccagc gtccctgtca ctaattgcga gaatgatgga attctgcctg aagggtcttg 727
atacctactc agtgaggtac stttgcttgg attgctgtga ttccaaaaaa aaaaaaaaaa 787
a 788

<210> 152
<211> 1931
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 173..847

<400> 152

actggatact atctggccag aagtagcaaa gcagctctta tttgaaaaac cactgggttc 60
cgagttcatt actacaggaa aaactgttct cttctgtggc acagagaacc ctgcttcaaa 120
gcagaagtag cagttccgga gtccagctgg ctaaaactca tcccagagga ta atg gca 178
Met Ala
1

acc cat gcc tta gaa atc gct ggg ctg ttt ctt ggt ggt gtt gga atg 226
Thr His Ala Leu Glu Ile Ala Gly Leu Phe Leu Gly Gly Val Gly Met
5 10 15

gtg ggc aca gtg gct gtc act gtc atg cct cag tgg aga gtg tcg gcc 274
Val Gly Thr Val Ala Val Thr Val Met Pro Gln Trp Arg Val Ser Ala
20 25 30

ttc att gaa aac aac atc gtg gtt ttt gaa aac ttc tgg gaa gga ctg 322
Phe Ile Glu Asn Asn Ile Val Val Phe Glu Asn Phe Trp Glu Gly Leu
35 40 45 50

tgg atg aat tgc gtg agg cag gct aac atc agg atg cag tgc aaa atc 370
Trp Met Asn Cys Val Arg Gln Ala Asn Ile Arg Met Gln Cys Lys Ile
55 60 65

tat gat tcc ctg ctg gct ctt tct ccg gac cta cag gca gcc aga gga 418
Tyr Asp Ser Leu Leu Ala Leu Ser Pro Asp Leu Gln Ala Ala Arg Gly
70 75 80

ctg atg tgt gct gct tcc gtg atg tcc ttc ttg gct ttc atg atg gcc 466
Leu Met Cys Ala Ala Ser Val Met Ser Phe Leu Ala Phe Met Met Ala
85 90 95

atc ctt ggc atg aaa tgc acc agg tgc acg ggg gac aat gag aag gtg 514
Ile Leu Gly Met Lys Cys Thr Arg Cys Thr Gly Asp Asn Glu Lys Val
100 105 110

aag gct cac att ctg ctg acg gct gga atc atc ttc atc atc gcg ggc 562
Lys Ala His Ile Leu Leu Thr Ala Gly Ile Ile Phe Ile Ile Ala Gly
115 120 125 130

atg gtg gtg ctc atc cct gtg agc tgg gtt gcc aat gcc atc atc aga 610
Met Val Val Leu Ile Pro Val Ser Trp Val Ala Asn Ala Ile Ile Arg
135 140 145

gat ttc tat aac cca ata gtg aat gtt gcc caa aaa cgt gag ctt gga 658
Asp Phe Tyr Asn Pro Ile Val Asn Val Ala Gln Lys Arg Glu Leu Gly
150 155 160

gaa gct ctc tac tta gga tgg acc acg gca ctg gtg ctg att gtt gga 706
Glu Ala Leu Tyr Leu Gly Trp Thr Thr Ala Leu Val Leu Ile Val Gly
165 170 175

gga gct ctg ttc tgc tgc gtt ttt tgt tgc aac gaa aag agc agt agc 754
Gly Ala Leu Phe Cys Cys Val Phe Cys Cys Asn Glu Lys Ser Ser Ser
180 185 190

tac aga tac tcg ata cct tct cat cgc aca acc caa aaa agt tat cac 802
Tyr Arg Tyr Ser Ile Pro Ser His Arg Thr Thr Gln Lys Ser Tyr His
195 200 205 210

acc gga aag aag tca ccg agc gtc tac tcc aga agt cag tat gtg 847
Thr Gly Lys Lys Ser Pro Ser Val Tyr Ser Arg Ser Gln Tyr Val
215 220 225

tagttgtgta tgttttttta actttactat aaagctatgc aaatgacaaa aatctatatt 907
actttctcaa aatggacccc aaagaaactt tgatttactg ttcttaactg cctaatctta 967
attacaggaa ctgtgcatca gctatttatg attctataag ctatttcagc agaatgagat 1027
attaaatcca atgctttgat tgttctagaa agtatagtaa tttgttttct aaggtggttc 1087
aagcatctac tctttttatc atttacttca aaatgacatt gctaaagact gcattatttt 1147
actactgtaa tttctccatg acatagcatt atgtacatag atgagtgtaa catttatatc 1207
tcacatagag acatgcttat atggttttat ttaaaatgaa atgccagtcc attacactga 1267
ataaatagaa ctcaactatt gcttttcagg gaaatcatgg atagggttga agaaggttac 1327
tattaattgt ttaaaaacag cttagggatt aatgtcctcc atttataatg aagattaaaa 1387
tgaaggcttt aatcagcatt gtaaaggaaa ttgaatggct ttctgatatg ctgtttttta 1447
gcctaggagt tagaaatcct aacttcttta tcctcttctc ccagaggctt tttttttctt 1507
gtgtattaaa ttaacatttt taaaaagcag atattttgtc aaggggcttt gcattcaaac 1567
tgcttttcca gggctatact cagaagaaag ataaaagtgt gatctaagaa aaagtgatgg 1627
ttttaggaaa gtgaaaatat ttttgttttt gtatttgaag aagaatgatg cattttgaca 1687
agaaatcata tatgtatgga tatattttaa taagtatttg agtacagact ttgaggtttc 1747
atcaatataa ataaaagagc agaaaaatat gtcttggttt tcatttgctt accaaaaaaa 1807
caacaacaaa aaaagttgtc ctttgagaac ttcacctgct cctatgtggg tacctgagtc 1867
aaaattgtca tttttgttct gtgaaaaata aatttccttc ttgtaccaaa aaaaaaaaaa 1927
aaaa 1931

<210> 153
<211> 514
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 100..441

<400> 153

ataccaggca ctttagaacc agagactctg ctgcttttcc tgggcagggc ctgcttgctc 60
cagctctcaa gtctgacttg catctacact gcgggcaag atg cgg ctg caa gac 114
Met Arg Leu Gln Asp
1 5

cgc atc gcc acg ttc ttc ttc cca aaa ggc atg atg ctc acc acg gct 162
Arg Ile Ala Thr Phe Phe Phe Pro Lys Gly Met Met Leu Thr Thr Ala
10 15 20

gcg ctg atg ctc ttc ttc tta cac ctg ggc atc ttc atc aga gac gtg 210
Ala Leu Met Leu Phe Phe Leu His Leu Gly Ile Phe Ile Arg Asp Val
25 30 35

cac aac ttc tgc atc acc tac cac tat gac cac atg agc ttt cac tac 258
His Asn Phe Cys Ile Thr Tyr His Tyr Asp His Met Ser Phe His Tyr
40 45 50

acg gtc gtc ctg atg ttc tcc cag gtg atc agc atc tgc tgg gct gcc 306
Thr Val Val Leu Met Phe Ser Gln Val Ile Ser Ile Cys Trp Ala Ala
55 60 65

atg ggg tca ctc tat gct gag atg aca gaa aac aat gct caa cgg agc 354
Met Gly Ser Leu Tyr Ala Glu Met Thr Glu Asn Asn Ala Gln Arg Ser
70 75 80 85

cat gtt ctt caa ccg cct gtc ctt gga gtt tct ggc cat cga gta ccg 402
His Val Leu Gln Pro Pro Val Leu Gly Val Ser Gly His Arg Val Pro
90 95 100

gga gga gca cca ctg agg cct ggg gag tcg gaa cag ggc taaggagggg 451
Gly Gly Ala Pro Leu Arg Pro Gly Glu Ser Glu Gln Gly
105 110

gaagcaaaag gctgcctcgg gtgttttaat aaagttgttg tttattccaa aaaaaaaaaa 511
aaa 514

<210> 154
<211> 1183
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 32..1132

<400> 154

acttctttcc tgcctctgat tccgggctgt c atg gcg acc ccc aac aat ctg 52
Met Ala Thr Pro Asn Asn Leu
1 5

acc ccc acc aac tgc agc tgg tgg ccc atc tcc gcg ctg gag agc gat 100
Thr Pro Thr Asn Cys Ser Trp Trp Pro Ile Ser Ala Leu Glu Ser Asp
10 15 20

gcg gcc aag cca gcg gag gcc ccc gac gct ccc gag gcg gcc agc ccc 148
Ala Ala Lys Pro Ala Glu Ala Pro Asp Ala Pro Glu Ala Ala Ser Pro
25 30 35

gcc cat tgg ccc agg gag agc ctg gtt ctg tac cac tgg acc cag tcc 196
Ala His Trp Pro Arg Glu Ser Leu Val Leu Tyr His Trp Thr Gln Ser
40 45 50 55

ttc agc tcg cag aag gtg cgg ctg gtg atc gcc gag aag ggc ctg gtg 244
Phe Ser Ser Gln Lys Val Arg Leu Val Ile Ala Glu Lys Gly Leu Val
60 65 70

tgc gag gag cgg gac gtg agc ctg cca cag agc gag cac aag gag ccc 292
Cys Glu Glu Arg Asp Val Ser Leu Pro Gln Ser Glu His Lys Glu Pro
75 80 85

tgg ttc atg cgg ctc aac ctg ggc gag gag gtg ccc gtc atc atc cac 340
Trp Phe Met Arg Leu Asn Leu Gly Glu Glu Val Pro Val Ile Ile His
90 95 100

cgc gac aac atc atc agt gac tat gac cag atc att gac tat gtg gag 388
Arg Asp Asn Ile Ile Ser Asp Tyr Asp Gln Ile Ile Asp Tyr Val Glu
105 110 115

cgc acc ttc aca gga gag cac gtg gtg gcc ctg atg ccc gag gtg ggc 436
Arg Thr Phe Thr Gly Glu His Val Val Ala Leu Met Pro Glu Val Gly
120 125 130 135

agc ctg cag cac gca cgg gtg ctg cag tac cgg gag ctg ctg gac gca 484
Ser Leu Gln His Ala Arg Val Leu Gln Tyr Arg Glu Leu Leu Asp Ala
140 145 150

ctg ccc atg gat gcc tac acg cat ggc tgc atc ctg cat ccc gag ctc 532
Leu Pro Met Asp Ala Tyr Thr His Gly Cys Ile Leu His Pro Glu Leu
155 160 165

acc acc gac tcc atg atc ccc aag tac gcc acg gcc gag atc cgc aga 580
Thr Thr Asp Ser Met Ile Pro Lys Tyr Ala Thr Ala Glu Ile Arg Arg
170 175 180

cat tta gcc aat gcc acc acg gac ctc atg aaa ctg gac cat gaa gag 628
His Leu Ala Asn Ala Thr Thr Asp Leu Met Lys Leu Asp His Glu Glu
185 190 195

gag ccc cag ctc tcc gag ccc tac ctt tct aaa caa aag aag ctc atg 676
Glu Pro Gln Leu Ser Glu Pro Tyr Leu Ser Lys Gln Lys Lys Leu Met
200 205 210 215

gtc aag atc ttg gag cat gat gat gtg agc tac ctg aag aag atc ctc 724
Val Lys Ile Leu Glu His Asp Asp Val Ser Tyr Leu Lys Lys Ile Leu
220 225 230

ggg gaa ctg gcc atg gtg ctg gac cag att gag gcg gag ctg gag aag 772
Gly Glu Leu Ala Met Val Leu Asp Gln Ile Glu Ala Glu Leu Glu Lys
235 240 245

agg aag ctg gag aac gag ggg cag aaa tgc gag ctg tgg ctc tgt ggc 820
Arg Lys Leu Glu Asn Glu Gly Gln Lys Cys Glu Leu Trp Leu Cys Gly
250 255 260

tgt gcc ttc acc ctc gct gat gtc ctc ctg gga gcc acc ctg cac cgc 868
Cys Ala Phe Thr Leu Ala Asp Val Leu Leu Gly Ala Thr Leu His Arg
265 270 275

ctc aag ttc ctg gga ctg tcc aag aaa tac tgg gaa gat ggc agc cgg 916
Leu Lys Phe Leu Gly Leu Ser Lys Lys Tyr Trp Glu Asp Gly Ser Arg
280 285 290 295

ccc aac ctg cag tcc ttc ttt gag agg gtc cag aga cgc ttt gcc ttc 964
Pro Asn Leu Gln Ser Phe Phe Glu Arg Val Gln Arg Arg Phe Ala Phe
300 305 310

cgg aaa gtc ctg ggt gac atc cac acc acc ctg ctg tcg gcc gtc atc 1012
Arg Lys Val Leu Gly Asp Ile His Thr Thr Leu Leu Ser Ala Val Ile
315 320 325

ccc aat gct ttc cgg ctg gtc aag agg aaa ccc cca tcc ttc ttc ggg 1060
Pro Asn Ala Phe Arg Leu Val Lys Arg Lys Pro Pro Ser Phe Phe Gly
330 335 340

gcg tcc ttc ctc atg ggc tcc ctg ggt ggg atg ggc tac ttt gcc tac 1108
Ala Ser Phe Leu Met Gly Ser Leu Gly Gly Met Gly Tyr Phe Ala Tyr
345 350 355

tgg tac ctc aag aaa aaa tac atc tagggccagg cctggggctt ggtgtctgac 1162
Trp Tyr Leu Lys Lys Lys Tyr Ile
360 365

tgtcaaaaaa aaaaaaaaaa a 1183




WO 01/42451 



PCT/IB00/01938



<210> 155
<211> 1545
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 160..996

<400> 155

acacagcatg catttcttca acaagcgact cagaaggcac ttgcacatcg ttgctgttct 60
gcctctttgc ttcagcatga ttacccagag gcgcacccgt gccgtggcct gcccgtcgtc 120
tatgcacccg tgctgtggcg tgcccgtcgt ctgtgtggc atg cct gtc tgt gca 174
Met Pro Val Cys Ala
1 5

ccc gtg ctg tgg cgt gcc cgt cgt ctg tgt ggc atg cct gtc tgt gca 222
Pro Val Leu Trp Arg Ala Arg Arg Leu Cys Gly Met Pro Val Cys Ala
10 15 20

ccc gtg ccg tgg cgt gcc cgt cgt ctg tgc acc cgt gct gtg gtg tgc 270
Pro Val Pro Trp Arg Ala Arg Arg Leu Cys Thr Arg Ala Val Val Cys
25 30 35

cct tcg tct gtt cct ttt att gcc ggg cag ggt tgc acc cac atg tgc 318
Pro Ser Ser Val Pro Phe Ile Ala Gly Gln Gly Cys Thr His Met Cys
40 45 50

aag cca gcg acg gac ccc agg ttc acc cgt tca ccg ctg gct gga ggc 366
Lys Pro Ala Thr Asp Pro Arg Phe Thr Arg Ser Pro Leu Ala Gly Gly
55 60 65

gtg atc ctg ggt gtg gcc ctg tgg ctc cgc cat gac ccg cag acc acc 414
Val Ile Leu Gly Val Ala Leu Trp Leu Arg His Asp Pro Gln Thr Thr
70 75 80 85

aac ctc ctg tat ctg gag ctg gga gac aag ccc gcg ccc aac acc ttc 462
Asn Leu Leu Tyr Leu Glu Leu Gly Asp Lys Pro Ala Pro Asn Thr Phe
90 95 100

tat gta ggc atc tac atc ctc atc gct gtg ggc gct gtc atg atg ttc 510
Tyr Val Gly Ile Tyr Ile Leu Ile Ala Val Gly Ala Val Met Met Phe
105 110 115

gtt ggc ttc ctg ggc tgc tac ggg gcc atc cag gaa tcc cag tgc ctg 558
Val Gly Phe Leu Gly Cys Tyr Gly Ala Ile Gln Glu Ser Gln Cys Leu
120 125 130

ctg ggg acg ttc ttc act tgc ctg gtc atc ctg ttt gcc tgt gag gtg 606
Leu Gly Thr Phe Phe Thr Cys Leu Val Ile Leu Phe Ala Cys Glu Val
135 140 145

gcc gcc ggc atc tgg ggc ttt gtc aac aag gac cag atc gcc aag gat 654
Ala Ala Gly Ile Trp Gly Phe Val Asn Lys Asp Gln Ile Ala Lys Asp
150 155 160 165

gtg aag cag ttc tat gac cag gcc cta cag cag gcc gtg gtg gat gat 702
Val Lys Gln Phe Tyr Asp Gln Ala Leu Gln Gln Ala Val Val Asp Asp
170 175 180

gac gcc aac aac gcc aag gct gtg gtg aag acc ttc cac gag acg ctt 750
Asp Ala Asn Asn Ala Lys Ala Val Val Lys Thr Phe His Glu Thr Leu
185 190 195

gac tgc tgt ggc tcc agc aca ctg act gct ttg acc acc tca gtg ctc 798
Asp Cys Cys Gly Ser Ser Thr Leu Thr Ala Leu Thr Thr Ser Val Leu
200 205 210

aag aac aat ttg tgt ccc tcg ggc agc aac atc atc agc aac ctc ttc 846
Lys Asn Asn Leu Cys Pro Ser Gly Ser Asn Ile Ile Ser Asn Leu Phe
215 220 225

aag gag gac tgc cac cag aag atc gat gac ctc ttc tcc ggg aag ctg 894
Lys Glu Asp Cys His Gln Lys Ile Asp Asp Leu Phe Ser Gly Lys Leu
230 235 240 245

tac ctc atc ggc att gct gcc atc gtg gtc gct gtg atc atg atc ttc 942
Tyr Leu Ile Gly Ile Ala Ala Ile Val Val Ala Val Ile Met Ile Phe
250 255 260

gag atg atc ctg agc atg gtg ctg tgc tgt ggc atc cgg aac agc tcc 990
Glu Met Ile Leu Ser Met Val Leu Cys Cys Gly Ile Arg Asn Ser Ser
265 270 275

gtg tac tgaggccccg cagctctggc cacagggacc tctgcagtgc cccctaagtg 1046
Val Tyr

acccggacac ttccgagggg gccatcaccg cctgtgtata taacgtttcc ggtattactc 1106
tgctacacgt agccttttta cttttggggt tttgtttttg ttctgaactt tcctgttacc 1166
ttttcagggc tgacgtcaca tgtaggtggc gtgtatgagt ggagacgggc ctgggtcttg 1226
gggactggag ggcaggggtc cttctgccct ggggtcccag ggtgctctgc ctgctcagcc 1286
aggcctctcc tgggagccac tcgcccagag actcagcttg gccaacttgg ggggctgtgt 1346
ccacccagcc cgcccgtcct gtgggctgca cagctcacct tgttccctcc tgccccggtt 1406
cgagagccga gtctgtgggc actctctgcc ttcatgcacc tgtcctttct aacacgtcgc 1466
cttcaactgt aatcacaaca tcctgactcc gtcatttaat aaagaaggaa catcaggcat 1526
gcaaaaaaaa aaaaaaaaa 1545



<210> 156
<211> 1068
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 11..529

<400> 156

gaagcacgga atg tgt ctc ctg ctg ggg gcc acg ggc gtc ggg aag acg 49
Met Cys Leu Leu Leu Gly Ala Thr Gly Val Gly Lys Thr
1 5 10

ctg ctg gtg aaa cgg ctg cag gag gtg agc tcc cgg gat ggg aaa ggc 97
Leu Leu Val Lys Arg Leu Gln Glu Val Ser Ser Arg Asp Gly Lys Gly
15 20 25

gac ctg ggg gag ccg ccc ccg aca cgg ccc acg gtg ggc acc aat ctt 145
Asp Leu Gly Glu Pro Pro Pro Thr Arg Pro Thr Val Gly Thr Asn Leu
30 35 40 45

act gac atc gtg gca cag aga aag atc acc atc cgg gag ctt ggg ggg 193
Thr Asp Ile Val Ala Gln Arg Lys Ile Thr Ile Arg Glu Leu Gly Gly
50 55 60

tgc atg ggc ccc atc tgg tcc agt tac tat gga aac tgc cgt tct ctc 241
Cys Met Gly Pro Ile Trp Ser Ser Tyr Tyr Gly Asn Cys Arg Ser Leu
65 70 75

ctg ttt gtg atg gac gcc tct gac ccc acc cag ctc tct gca tcc tgt 289
Leu Phe Val Met Asp Ala Ser Asp Pro Thr Gln Leu Ser Ala Ser Cys
80 85 90

gtg cag ctc tta ggt ctc ctt tct gca gaa caa ctt gca gaa gca tcg 337
Val Gln Leu Leu Gly Leu Leu Ser Ala Glu Gln Leu Ala Glu Ala Ser
95 100 105

gtg ctg ata ctc ttc aat aaa atc gac cta ccc tgt tac atg tcc acg 385
Val Leu Ile Leu Phe Asn Lys Ile Asp Leu Pro Cys Tyr Met Ser Thr
110 115 120 125

gag gag atg aag tca tta atc agg ctt cca gac atc att gct tgt gcc 433
Glu Glu Met Lys Ser Leu Ile Arg Leu Pro Asp Ile Ile Ala Cys Ala
130 135 140

aag cag aac atc acc acg gca gaa atc agc gcc cgt gaa ggc act ggc 481
Lys Gln Asn Ile Thr Thr Ala Glu Ile Ser Ala Arg Glu Gly Thr Gly
145 150 155

tta gca ggg gtg ctg gcc tgg ctc cag gcc acc cac aga gcc aac gat 529
Leu Ala Gly Val Leu Ala Trp Leu Gln Ala Thr His Arg Ala Asn Asp
160 165 170

tgactgcacg gcagaggcgc agctggcctg agctggggag aggtggcaga gggcagtatg 589
gctttgctgc caatagtttc ttctcacagg ggcagaataa cccaaagtaa ccctacatga 649
tggggctctg tgctgggatg caatgatgtg taaactgagg catgtggaga tggaagttga 709
catctggcct ctgaaaaaag tgtccccagg ggctaggcat ggtggctcac acctgtaatc 769
ccagcacttt gagaggccga ggcgggtgta tcacctgagg tcgggagttc gagactagcc 829
tgaccaacat ggagaaaccc tgtctctact aaaaatacaa aattagctgg gtgtgctggt 889
gcatgcctgt aatctcagct acttgggagg ctgagacagg agaatccctt gaacctggga 949
ggtggaggtt gcagtgagtc gagatcatgc cattgcactg cacctgggca acaagagtga 1009
aactccgtct taaaaaatat aagaaataaa aaaataaaaa cctaaaaaaa aaaaaaaaa 1068



<210> 157
<211> 1097
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 135..749

<400> 157

aacgaaacgg taaccagccc tgggaagccc gcaagaggcc tcagcggtgg ccgtccgagc 60
gccgagaggt gagggtgccc ccgcctcacc tgcagagggg ccgttccggg ctcgaacccg 120
gcaccttccg gaaa atg gcg gct gcc agg ccc agc ctg ggc mga gtc ctc 170
Met Ala Ala Ala Arg Pro Ser Leu Gly Arg Val Leu
1 5 10

cca gga tcc tct gtc ctg ttc ctg tgt gac atg cag gag aag ttc cgc 218
Pro Gly Ser Ser Val Leu Phe Leu Cys Asp Met Gln Glu Lys Phe Arg
15 20 25

cac aac atc gcc tac ttc cca cag atc gtc tca gtg gct gcc cgc atg 266
His Asn Ile Ala Tyr Phe Pro Gln Ile Val Ser Val Ala Ala Arg Met
30 35 40

ctc aag gtg gcc cgg ctg ctt gag gtg cca gtc atg ctg acg gag cag 314
Leu Lys Val Ala Arg Leu Leu Glu Val Pro Val Met Leu Thr Glu Gln
45 50 55 60

tac cca caa ggc ctg ggc ccc acg gtg ccc gag ctg ggg act gag ggc 362
Tyr Pro Gln Gly Leu Gly Pro Thr Val Pro Glu Leu Gly Thr Glu Gly
65 70 75

ctt cgg ccg ctg gcc aag acc tgc ttc agc atg gtg cct gcc ctg cag 410
Leu Arg Pro Leu Ala Lys Thr Cys Phe Ser Met Val Pro Ala Leu Gln
80 85 90

cag gag ctg gac agt cgg ccc cag ctg cgc tct gtg ctg ctc tgt ggc 458
Gln Glu Leu Asp Ser Arg Pro Gln Leu Arg Ser Val Leu Leu Cys Gly
95 100 105

att gag gca cag gcc tgc atc ttg aac acg acc ctg gac ctc cta gac 506
Ile Glu Ala Gln Ala Cys Ile Leu Asn Thr Thr Leu Asp Leu Leu Asp
110 115 120

cgg ggg ctg cag gtc cat gtg gtg gtg gac gcc tgc tcc tca cgc agc 554
Arg Gly Leu Gln Val His Val Val Val Asp Ala Cys Ser Ser Arg Ser
125 130 135 140

cag gtg gac cgt ctg gtg gct ctg gcc cgc atg aga cag agt ggt gcc 602
Gln Val Asp Arg Leu Val Ala Leu Ala Arg Met Arg Gln Ser Gly Ala
145 150 155

ttc ctc tcc acc agc gaa ggg ctc att ctg cag ctt gtg ggc gat gcc 650
Phe Leu Ser Thr Ser Glu Gly Leu Ile Leu Gln Leu Val Gly Asp Ala
160 165 170

gtc cac ccc cag ttc aag gag atc cag aaa ctc atc aag gag ccc gcc 698
Val His Pro Gln Phe Lys Glu Ile Gln Lys Leu Ile Lys Glu Pro Ala
175 180 185

cca gac agc gga ctg ctg ggc ctc ttc caa ggc cag aac tcc ctc ctc 746
Pro Asp Ser Gly Leu Leu Gly Leu Phe Gln Gly Gln Asn Ser Leu Leu
190 195 200

cac tgaactccaa ccctgccttg agggaagacc accctcctgt cacccggacc 799
His
205

tcagtggaag cccgttcccc ccatccctgg atcccaagag tggtgcgatc caccaggagt 859
gccgccccct tgtggggggg ggcagggtgc tgccttccca ttggacagct gctcccggaa 919
atgcaaatga gactcctgga aactgggtgg gaattggctg agccaagatg gaggcggggc 979
tcggccccgg gccacttcac ggggcgggaa ggggagggga agaagagtct cagactgtgg 1039
gacacggact cgcagaataa acatatatgt ggctgtggac caaaaaaaaa aaaaaaaa 1097

<210> 158
<211> 894
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 98..637

<400> 158

ctttggggcc gaagtgggcg tgcggctcgc gctgttcgcg gccttcctgg tgacggagct 60
gctccccccg ttccagagac tcatccagcc ggaggag atg tgg ctc tac cgg aac 115
Met Trp Leu Tyr Arg Asn
1 5

ccc tac gtg gag gcg gag tat ttc ccc acc aag ccg atg ttt gtt att 163
Pro Tyr Val Glu Ala Glu Tyr Phe Pro Thr Lys Pro Met Phe Val Ile
10 15 20

gca ttt ctc tct cca ctg tct ctg atc ttc ctg gcc aaa ttt ctc aag 211
Ala Phe Leu Ser Pro Leu Ser Leu Ile Phe Leu Ala Lys Phe Leu Lys
25 30 35

aag gca gac aca aga gac agc aga caa gcc tgc ctg gct gcc agc ctt 259
Lys Ala Asp Thr Arg Asp Ser Arg Gln Ala Cys Leu Ala Ala Ser Leu
40 45 50

gcc ctg gct ctg aat ggc gtc ttt acc aac aca ata aaa ctg atc gta 307
Ala Leu Ala Leu Asn Gly Val Phe Thr Asn Thr Ile Lys Leu Ile Val
55 60 65 70

ggg agg cca cgc cca gat ttc ttc tac cgc tgc ttc cct gat ggg cta 355
Gly Arg Pro Arg Pro Asp Phe Phe Tyr Arg Cys Phe Pro Asp Gly Leu
75 80 85

gcc cat tct gac ttg atg tgt aca ggg gat aag gac gtg gtg aat gag 403
Ala His Ser Asp Leu Met Cys Thr Gly Asp Lys Asp Val Val Asn Glu
90 95 100

ggc cga aag agc ttc ccc agt gga cat tct tcc ttt gca ttt gct ggt 451
Gly Arg Lys Ser Phe Pro Ser Gly His Ser Ser Phe Ala Phe Ala Gly
105 110 115

ctg gcc ttt gcg tcc ttc tac ctg gca ggg aag tta cac tgc ttc aca 499
Leu Ala Phe Ala Ser Phe Tyr Leu Ala Gly Lys Leu His Cys Phe Thr
120 125 130

cca caa ggc cgt ggg aaa tct tgg agg ttc tgt gcc ttt ctg tca cct 547
Pro Gln Gly Arg Gly Lys Ser Trp Arg Phe Cys Ala Phe Leu Ser Pro
135 140 145 150

cta ctt ttt gca gct gtg att gca ctg tcc cgc aca tgt gac tac aag 595
Leu Leu Phe Ala Ala Val Ile Ala Leu Ser Arg Thr Cys Asp Tyr Lys
155 160 165

cat cac tgg caa gat ctg ctc aaa tgc acc aac act gcc aag 637
His His Trp Gln Asp Leu Leu Lys Cys Thr Asn Thr Ala Lys
170 175 180

tgactaaggt agaaaagaaa aatgacaggt atcgtcatct gaaggacaga tgaatctttt 697
tctgcccctt cttcacaatg gaatataagg aacaattatg ggatgtcatc agaatggatg 757
ccataggacc tacagctccc tttctcttta ttgtgattat actttaaata tgacattgtc 817
ttttatgtgt atgttcctat attttcaatg tatctttttc cttcagtaaa cctgatattc 877
aaaaaaaaaa aaaaaaa 894

<210> 159
<211> 703
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 221..670

<400> 159

aaggaagcgc cgccccttcc tacggctacg ggaaggatcg tccagtggct gaggctggac 60
agagcacaga caaaagataa aaagcaagat ttgagagagg ttcctggatc aactggctca 120
atctgcttag ttctacaaag tggagtttct gggcatcatt cttcatttct gtacacaaag 180
tgctgtgaag ctcaagaaga aatagctctg cacaggaacg atg tgc act gcc cta 235
Met Cys Thr Ala Leu
1 5

ctg ctt ctt tat cta aga tgg tgt ttc aac tta aaa ctt gtg aat gtg 283
Leu Leu Leu Tyr Leu Arg Trp Cys Phe Asn Leu Lys Leu Val Asn Val
10 15 20

aaa tat gag cca aaa gac tct ctc ggc cct gaa atg acc ttt gta gca 331
Lys Tyr Glu Pro Lys Asp Ser Leu Gly Pro Glu Met Thr Phe Val Ala
25 30 35

gat gct gcc aga ggc ccc ctg tta tcc tcc ctg gac tct cca gct aac 379
Asp Ala Ala Arg Gly Pro Leu Leu Ser Ser Leu Asp Ser Pro Ala Asn
40 45 50

ctg atg tca act gcc agt gtg tgc atc tcc tta cct gag ggc tgt tct 427
Leu Met Ser Thr Ala Ser Val Cys Ile Ser Leu Pro Glu Gly Cys Ser
55 60 65

ggt ggc agg agt cct tgc tac tca cag aaa tgg cca cca gaa gtg cca 475
Gly Gly Arg Ser Pro Cys Tyr Ser Gln Lys Trp Pro Pro Glu Val Pro
70 75 80 85

gaa aaa tta acc tcc ctt ggc cag cag tcc tca acc agc tcc ctc act 523
Glu Lys Leu Thr Ser Leu Gly Gln Gln Ser Ser Thr Ser Ser Leu Thr
90 95 100

gac act gat gtg cag gtg tct cct atg ctg gtt gct gga gtc aac cac 571
Asp Thr Asp Val Gln Val Ser Pro Met Leu Val Ala Gly Val Asn His
105 110 115

agc agc agc ctt ctt gac aac ata ccc ttc act ggc tgc ctt cct ttc 619
Ser Ser Ser Leu Leu Asp Asn Ile Pro Phe Thr Gly Cys Leu Pro Phe
120 125 130

cat ctc tct tct tca ctc ccc tac cta tgt ctc cta ggc tct ccc ttc 667
His Leu Ser Ser Ser Leu Pro Tyr Leu Cys Leu Leu Gly Ser Pro Phe
135 140 145

aaa taaacagctt gcacttgaaa aaaaaaaaaa aaa 703
Lys
150

<210> 160
<211> 849
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 165..674

<400> 160

aaaactgagg cctgggagca ggaacctgta ggcagcgctt gagggtagcg ggatagcagc 60
tgcaagcgcg cgtgggaggc gggggctctg ggcggaacaa aaatcacagg atgtcagagg 120
atgtttcccg ggaagaactg ggataaagga agggtcccag cacc atg gag gac ccg 176
Met Glu Asp Pro
1

aac cct gaa gag aac atg aag cag cag gat tca ccc aag gag aga agt 224
Asn Pro Glu Glu Asn Met Lys Gln Gln Asp Ser Pro Lys Glu Arg Ser
5 10 15 20

ccc cag agc cca gga ggc aac atc tgc cac ctg ggg gcc ccg aag tgc 272
Pro Gln Ser Pro Gly Gly Asn Ile Cys His Leu Gly Ala Pro Lys Cys
25 30 35

acc cgc tgc ctc atc acc ttc gca gat tcc aag ttc cag gag cgt cac 320
Thr Arg Cys Leu Ile Thr Phe Ala Asp Ser Lys Phe Gln Glu Arg His
40 45 50

atg aag cgg gag cac cca gcg gac ttc gtg gcc cag aag ctg cag ggg 368
Met Lys Arg Glu His Pro Ala Asp Phe Val Ala Gln Lys Leu Gln Gly
55 60 65

gtc ctc ttc atc tgc ttc acc tgc gcc cgc tcc ttc ccc tcc tcc aaa 416
Val Leu Phe Ile Cys Phe Thr Cys Ala Arg Ser Phe Pro Ser Ser Lys
70 75 80

gcc cta atc acc cac cag cgc agc cac ggt cca gcc gcc aag ccc acc 464
Ala Leu Ile Thr His Gln Arg Ser His Gly Pro Ala Ala Lys Pro Thr
85 90 95 100

ctg ccg gtt gca acc act act gcc cag ccc acc ttc cct tgt cct gac 512
Leu Pro Val Ala Thr Thr Thr Ala Gln Pro Thr Phe Pro Cys Pro Asp
105 110 115

tgt ggc aag acc ttt ggg cag gct gtt tct ctg agg cgg cac cgc cag 560
Cys Gly Lys Thr Phe Gly Gln Ala Val Ser Leu Arg Arg His Arg Gln
120 125 130

atg cat gag gtc cgt gcc cct cct ggc acc ttc gcc tgc aca gag tgc 608
Met His Glu Val Arg Ala Pro Pro Gly Thr Phe Ala Cys Thr Glu Cys
135 140 145

ggt cag gac ttt gct cag gaa gca ggg ctg cat caa cac tac att cgg 656
Gly Gln Asp Phe Ala Gln Glu Ala Gly Leu His Gln His Tyr Ile Arg
150 155 160

cat gcc cgg ggg gag ctc tgagtgcagc ttaagcctct ccacggtgac 704
His Ala Arg Gly Glu Leu
165 170

gggtggctct gtggctggta ggactcaccc atgatatggg gtgcaggaac tctgggggcc 764
ctgaaggatt tgcttccctc ccctgggaag gcagagggct cttaataaag aggacccaga 824
agattctcaa aaaaaaaaaa aaaaa 849



<210> 161
<211> 846
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 165..671

<400> 161

aaaactgagg cctgggagca ggaacctgta ggcagcgctt gagggtagcg ggatagcagc 60
tgcaagcgcg cgtgggaggc gggggctctg ggcggaacaa aaatcacagg atgtcagagg 120
atgtttcccg ggaagaactg ggataaagga agggtcccag cacc atg gag gac ccg 176
Met Glu Asp Pro
1

aac cct gaa gag aac atg aag cag cag gat tca ccc aag gag aga agt 224
Asn Pro Glu Glu Asn Met Lys Gln Gln Asp Ser Pro Lys Glu Arg Ser
5 10 15 20

ccc cag ccc agg agg caa cat ctg cca cct ggg ggc ccc gaa gtg cac 272
Pro Gln Pro Arg Arg Gln His Leu Pro Pro Gly Gly Pro Glu Val His
25 30 35

ccg ctg cct cat cac ctt cgc aga ttc caa gtt cca gga gcg tca cat 320
Pro Leu Pro His His Leu Arg Arg Phe Gln Val Pro Gly Ala Ser His
40 45 50

gaa gcg gga gca ccc agc gga ctt cgt ggc cca gaa gct gca ggg ggt 368
Glu Ala Gly Ala Pro Ser Gly Leu Arg Gly Pro Glu Ala Ala Gly Gly
55 60 65

cct ctt cat ctg ctt cac ctg cgc ccg ctc ctt ccc ctc ctc caa agc 416
Pro Leu His Leu Leu His Leu Arg Pro Leu Leu Pro Leu Leu Gln Ser
70 75 80

cct aat cac cca cca gcg cag cac ggt cca gcc gcc aag ccc acc ctg 464
Pro Asn His Pro Pro Ala Gln His Gly Pro Ala Ala Lys Pro Thr Leu
85 90 95 100

ccg gtt gca acc act act gcc cag ccc acc ttc cct tgt cct gac tgt 512
Pro Val Ala Thr Thr Thr Ala Gln Pro Thr Phe Pro Cys Pro Asp Cys
105 110 115

ggc aag acc ttt ggg cag gct gtt tct ctg agg cgg cac cgc cag atg 560
Gly Lys Thr Phe Gly Gln Ala Val Ser Leu Arg Arg His Arg Gln Met
120 125 130

cat gag gtc cgt gcc cct cct ggc acc ttc gcc tgc aca gag tgc ggt 608
His Glu Val Arg Ala Pro Pro Gly Thr Phe Ala Cys Thr Glu Cys Gly
135 140 145

cag gac ttt gct cag gaa gca ggg ctg cat caa cac tac att cgg cat 656
Gln Asp Phe Ala Gln Glu Ala Gly Leu His Gln His Tyr Ile Arg His
150 155 160

gcc cgg ggg gag ctc tgagtgcagc ttaagcctct ccacggtgac gggtggctct 711
Ala Arg Gly Glu Leu
165

gtggctggta ggactcaccc atgatatggg gtgcaggaac tctgggggcc ctgaaggatt 771
tgcttccctc ccctgggaag gcagagggct cttaataaag aggacccaga agattctcaa 831
aaaaaaaaaa aaaaa 846

<210> 162 
20 <211> 1176 
<212> DNA 
<213> Homo sapiens 

<220> 
25 <221> CDS 

<222> 28 . . 1128 

<400> 162 

ctttcctgcc tctgattccg ggctgtc atg gcg acc ccc aac aat ctg acc ccc 54
Met Ala Thr Pro Asn Asn Leu Thr Pro
1 5

acc aac tgc agc tgg tgg ccc atc tcc gcg ctg gag agc gat gcg gcc 102
Thr Asn Cys Ser Trp Trp Pro Ile Ser Ala Leu Glu Ser Asp Ala Ala
10 15 20 25

aag cca gcg gag gcc ccc gac gct ccc gag gcg gcc agc ccc gcc cat 150
Lys Pro Ala Glu Ala Pro Asp Ala Pro Glu Ala Ala Ser Pro Ala His
30 35 40

tgg ccc agg gag agc ctg gtt ctg tac cac tgg acc cag tcc ttc agc 198
Trp Pro Arg Glu Ser Leu Val Leu Tyr His Trp Thr Gln Ser Phe Ser
45 50 55

tcg cag aag gtg cgg ctg gtg atc gcc gag aag ggc ctg gtg tgc gag 246
Ser Gln Lys Val Arg Leu Val Ile Ala Glu Lys Gly Leu Val Cys Glu
60 65 70

gag cgg gac gtg agc ctg cca cag agc gag cac aag gag ccc tgg ttc 294
Glu Arg Asp Val Ser Leu Pro Gln Ser Glu His Lys Glu Pro Trp Phe
75 80 85

atg cgg ctc aac ctg ggc gag gag gtg ccc gtc atc atc cac cgc gac 342
Met Arg Leu Asn Leu Gly Glu Glu Val Pro Val Ile Ile His Arg Asp
90 95 100 105

aac atc atc agt gac tat gac cag atc att gac tat gtg gag cgc acc 390
Asn Ile Ile Ser Asp Tyr Asp Gln Ile Ile Asp Tyr Val Glu Arg Thr
110 115 120

ttc aca gga gag cac gtg gtg gcc ctg atg ccc gag gtg ggc agc ctg 438
Phe Thr Gly Glu His Val Val Ala Leu Met Pro Glu Val Gly Ser Leu
125 130 135

cag cac gca cgg gtg ctg cag tac cgg gag ctg ctg gac gca ctg ccc 486
Gln His Ala Arg Val Leu Gln Tyr Arg Glu Leu Leu Asp Ala Leu Pro
140 145 150

atg gat gcc tac acg cat ggc tgc atc ctg cat ctc gag ctc acc acc 534
Met Asp Ala Tyr Thr His Gly Cys Ile Leu His Leu Glu Leu Thr Thr
155 160 165

gac tcc atg atc ccc aag tac gcc acg gcc gag atc cgc aga cat tta 582
Asp Ser Met Ile Pro Lys Tyr Ala Thr Ala Glu Ile Arg Arg His Leu
170 175 180 185

gcc aat gcc acc acg gac ctc atg aaa ctg gac cat gaa gag gag ccc 630
Ala Asn Ala Thr Thr Asp Leu Met Lys Leu Asp His Glu Glu Glu Pro
190 195 200

cag ctc tcc gag ccc tac ctt tct aaa caa aag aag ctc atg gcc aag 678
Gln Leu Ser Glu Pro Tyr Leu Ser Lys Gln Lys Lys Leu Met Ala Lys
205 210 215

atc ttg gag cat gat gat gtg agc tac ctg aag aag atc ctc ggg gaa 726
Ile Leu Glu His Asp Asp Val Ser Tyr Leu Lys Lys Ile Leu Gly Glu
220 225 230

ctg gcc atg gtg ctg gac cag att gag gcg gag ctg gag aag agg aag 774
Leu Ala Met Val Leu Asp Gln Ile Glu Ala Glu Leu Glu Lys Arg Lys
235 240 245

ctg gag aac gag ggg cag aaa tgc gag ctg tgg ctc tgt ggc tgt gcc 822
Leu Glu Asn Glu Gly Gln Lys Cys Glu Leu Trp Leu Cys Gly Cys Ala
250 255 260 265

ttc acc ctc gct gat gtc ctc ctg gga gcc acc ctg cac cgc ctc aag 870
Phe Thr Leu Ala Asp Val Leu Leu Gly Ala Thr Leu His Arg Leu Lys
270 275 280

ttc ctg gga ctg tcc aag aaa tac tgg gaa gat ggc agc cgg ccc aac 918
Phe Leu Gly Leu Ser Lys Lys Tyr Trp Glu Asp Gly Ser Arg Pro Asn
285 290 295

ctg cag tcc ttc ttt gag agg gtc cag aga cgc ttt gcc ttc cgg aaa 966
Leu Gln Ser Phe Phe Glu Arg Val Gln Arg Arg Phe Ala Phe Arg Lys
300 305 310

gtc ctg ggt gac atc cac acc acc ctg ctg tcg gcc gtc atc ccc aat 1014
Val Leu Gly Asp Ile His Thr Thr Leu Leu Ser Ala Val Ile Pro Asn
315 320 325

gct ttc cgg ctg gtc aag agg aaa ccc cca tcc ttc ttc ggg gcg tcc 1062
Ala Phe Arg Leu Val Lys Arg Lys Pro Pro Ser Phe Phe Gly Ala Ser
330 335 340 345

ttc ctc atg ggc tcc ctg ggt ggg atg ggc tac ttt gcc tac tgg tac 1110
Phe Leu Met Gly Ser Leu Gly Gly Met Gly Tyr Phe Ala Tyr Trp Tyr
350 355 360

ctc aag aaa aaa tac atc tagggccagg cctggggctt ggtgtctgac 1158
Leu Lys Lys Lys Tyr Ile
365

aaaaaamaaa aaaaaaaa 1176

<210> 163
<211> 1084
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 135..194

<400> 163

aacgaaacgg taaccagccc tgggaagccc gcaagaggcc tcagcggtgg ccgtccgagc 60

gccgagaggt gagggtgccc ccgcctcacc tgcagagggg ccgttccggg ctcgaacccg 120

gcaccttccg gaaa atg gcg gct gcc agg ccc agc ctg ggc cga gtc ctc 170
Met Ala Ala Ala Arg Pro Ser Leu Gly Arg Val Leu
1 5 10

cca gga tcc tct cct gtt cct gtg tgacatgcag gagaagttcc gccacaacat 224
Pro Gly Ser Ser Pro Val Pro Val
15 20

cgcctacttc ccacagatcg tctcagtggc tgcccgcatg ctcaaggtgg cccggctgct 284

tgaggtgcca gtcatgctga cggagcagta cccacaaggc ctgggcccca cggtgcccga 344

gctggggact gagggccttc ggccgctggc caagacctgc ttcagcatgg tgcctgccct 404

gcagcaggag ctggacagtc ggccccagct gcgctctgtg ctgctctgtg gcattgaggc 464

acaggcctgc atcttgaaca cgaccctgga cctcctagac cgggggctgc aggtccatgt 524

ggtggtggac gcctgctcct cacgcagcca ggtggaccgg ctggtggctc tggcccgcat 584

gagacagagt ggtgccttcc tctccaccag cgaagggctc attctgcagc ttgtgggcga 644 

tgccgtccac ccccagttca aggagatcca gaaactcatc aaggagcccg ccccagacag 704 

cggactgctg ggcctcttcc aaggccagaa ctccctcctc cactgaactc caaccctgcc 764 

ttgagggaag accaccctcc tgtcacccgg acctcagtgg aagcccgttc cccccatccc 824 

tggatcccaa gagtggtgcg atccaccagg agtgccgccc ccttgggggg ggcagggtgc 884 

tgccttccca ttggacagct gctcccggaa atgcaaatga gactcctgga aactgggtgg 944 

gaattggctg agccaagatg gaggcggggc tcggccccgg gccacttcac ggggcgggaa 1004 

ggggagggga agaagagtct cagactgtgg gacacggact cgcagaataa acatatatgt 1064 
ggcaaaaaaa aaaaaaaaaa 1084

<210> 164
<211> 1793
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 173..847

<400> 164

gsmggrggcc attacctaga acatcstaat cgaarratta tttgaaaaac cactgggttc 60
cgagttcatt actacaggaa aaactgttct cttctgtggc acagagaacc ctgcttcaaa 120
gcagaagtag cagttccgga gtccagctgg ctaaaactca tcccagagga ta atg gca 178
Met Ala
1

acc cat gcc tta gaa atc gct ggg ctg ttt ctt ggt ggt gtt gga atg 226
Thr His Ala Leu Glu Ile Ala Gly Leu Phe Leu Gly Gly Val Gly Met
5 10 15

gtg ggc aca gtg gct gtc act gtc atg cct cag tgg aga gtg tcg gcc 274
Val Gly Thr Val Ala Val Thr Val Met Pro Gln Trp Arg Val Ser Ala
20 25 30

ttc att gaa aac aac atc gtg gtt ttt gaa aac ttc tgg gaa gga ctg 322
Phe Ile Glu Asn Asn Ile Val Val Phe Glu Asn Phe Trp Glu Gly Leu
35 40 45 50

tgg atg aat tgc gtg agg cag gct aac atc agg atg cag tgc aaa atc 370
Trp Met Asn Cys Val Arg Gln Ala Asn Ile Arg Met Gln Cys Lys Ile
55 60 65

tat gat tcc ctg ctg gct ctt tct ccg gac cta cag gca gcc aga gga 418
Tyr Asp Ser Leu Leu Ala Leu Ser Pro Asp Leu Gln Ala Ala Arg Gly
70 75 80

ctg atg tgt gct gct tcc gtg atg tcc ttc ttg gct ttc atg atg gcc 466
Leu Met Cys Ala Ala Ser Val Met Ser Phe Leu Ala Phe Met Met Ala
85 90 95

atc ctt ggc atg aaa tgc acc agg tgc acg ggg gac aat gag aag gtg 514
Ile Leu Gly Met Lys Cys Thr Arg Cys Thr Gly Asp Asn Glu Lys Val
100 105 110

aag gct cac att ctg ctg acg gct gga atc atc ttc atc atc acg ggc 562
Lys Ala His Ile Leu Leu Thr Ala Gly Ile Ile Phe Ile Ile Thr Gly
115 120 125 130

atg gtg gtg ctc atc cct gtg agc tgg gtt gcc aat gcc atc atc aga 610
Met Val Val Leu Ile Pro Val Ser Trp Val Ala Asn Ala Ile Ile Arg
135 140 145

gat ttc tat aac tca ata gtg aat gtt gcc caa aaa cgt gag ctt gga 658
Asp Phe Tyr Asn Ser Ile Val Asn Val Ala Gln Lys Arg Glu Leu Gly
150 155 160

gaa gct ctc tac tta gga tgg acc acg gca ctg gtg ctg att gtt gga 706
Glu Ala Leu Tyr Leu Gly Trp Thr Thr Ala Leu Val Leu Ile Val Gly
165 170 175

gga gct ctg ttc tgc tgc gtt ttt tgt tgc aac gaa aag agc agt agc 754
Gly Ala Leu Phe Cys Cys Val Phe Cys Cys Asn Glu Lys Ser Ser Ser
180 185 190

tac aga tac tcg ata cct tcc cat cgc aca acc caa aaa agt tat cac 802
Tyr Arg Tyr Ser Ile Pro Ser His Arg Thr Thr Gln Lys Ser Tyr His
195 200 205 210

acc gga aag aag tca ccg agc gtc tac tcc aga agt cag tat gtg 847
Thr Gly Lys Lys Ser Pro Ser Val Tyr Ser Arg Ser Gln Tyr Val
215 220 225

tagttgtgta tgttttttta actttactat aaagccatgc aaatgacaaa aatctatatt 907
actttctcaa aatggacccc aaagaaactt tgatttactg ttcttaactg cctaatctta 967
attacaggaa ctgtgcatca gctatttatg attctataag ctatttcagc agaatgagat 1027
attaaaccca atgctttgat tgttctagaa agtattgtaa tttgttttct aaggtggttc 1087
aagcatctac tctttttatc atttacttca aaatgacatt gctaaagact gcattattct 1147
actactgtaa tttctccacg acatagcatt atgtacatag atgagtgtaa catttatatc 1207
tcacatagag acatgcttat atggttttat ttaaaatgaa atgccagtcc attacactga 1267
ataaatagaa ctcaactatt gcttttcagg gaaatcatgg atagggttga agaaggttac 1327
tattaattgt ttaaaaacag cttatggatt aatgtcctcc atttataatg aagattaaaa 1387
tgaaggcttt aatcagcatt gtaaaggaaa ttgaatggct ttctgatatg ctgtttttta 1447
gcctaggagt tagaaatcct aacttcttta tcctcttctc ccagaggctt tttttttctt 1507
gtgtattaaa ttaacatttt taaaaagcag atattttgtc aaggggcttt gcattcaaac 1567
tgcttttcca gggctatact cagaagaaag ataaaagtgt gatctaagaa aaagtgatgg 1627
ttttaggaaa gtgaaaatat ttttgttttt gtatttgaag aagaatgatg cattttgaca 1687
agaaatcata tatgtatgta tatattttaa taagtatttg agtacagact ttgaggtttc 1747
atcaatataa ataaaagagc agaaaagtaa aaaaaaaaaa aaaaaa 1793



<210> 165
<211> 1849
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 8..1141

<220>
<221> misc_feature
<222> 1707
<223> n=a, g, c or t

<400> 165

cgttgcc atg gat cct ggg gac gac tgg ctg gtg gaa tcc ttg cgc ttg 49
Met Asp Pro Gly Asp Asp Trp Leu Val Glu Ser Leu Arg Leu
1 5 10

tac cag gat ttc tat gca ttc gac ctg tca gga gcc act cga gtc ctt 97
Tyr Gln Asp Phe Tyr Ala Phe Asp Leu Ser Gly Ala Thr Arg Val Leu
15 20 25 30

gaa tgg att gat gac aaa gga gtc ttt gtt gct ggc tat gaa agc ctg 145
Glu Trp Ile Asp Asp Lys Gly Val Phe Val Ala Gly Tyr Glu Ser Leu
35 40 45

aaa aag aat gaa att ctt cat ctg aaa tta cct ctc aga ctt tct gta 193
Lys Lys Asn Glu Ile Leu His Leu Lys Leu Pro Leu Arg Leu Ser Val
50 55 60

aag gaa aac aag ggc tta ttc cca gaa aga gat ttc aaa gtg cgc cat 241
Lys Glu Asn Lys Gly Leu Phe Pro Glu Arg Asp Phe Lys Val Arg His
65 70 75

gga gga ttt tca gac agg tct atc ttt gat cta aag cat gtg cca cat 289
Gly Gly Phe Ser Asp Arg Ser Ile Phe Asp Leu Lys His Val Pro His
80 85 90

acc aga ttg ctg gtt acc agt ggc ctt cca ggt tgt tat ctg cag gtg 337
Thr Arg Leu Leu Val Thr Ser Gly Leu Pro Gly Cys Tyr Leu Gln Val
95 100 105 110

tgg cag gtt gca gag gac agt gat gtc att aaa gct gtc agc acc att 385
Trp Gln Val Ala Glu Asp Ser Asp Val Ile Lys Ala Val Ser Thr Ile
115 120 125

gct gtg cat gag aaa gag gag agt ctc tgg cct agg gtg gcc gtc ttc 433
Ala Val His Glu Lys Glu Glu Ser Leu Trp Pro Arg Val Ala Val Phe
130 135 140

tcc aca ttg gca ccc gga gtc ctc cat ggg gcg agg ctc cga agt ctg 481
Ser Thr Leu Ala Pro Gly Val Leu His Gly Ala Arg Leu Arg Ser Leu
145 150 155

cag gtc gtt gat ctg gag tcc cgg aag acc acg tac acc tca gat gtc 529
Gln Val Val Asp Leu Glu Ser Arg Lys Thr Thr Tyr Thr Ser Asp Val
160 165 170

agt gac agt gag gag ctg agt agc ctg cag gtc cta gat gcg gac acc 577
Ser Asp Ser Glu Glu Leu Ser Ser Leu Gln Val Leu Asp Ala Asp Thr
175 180 185 190

ttt gcc ttc tgc tgt gct tcg ggc cgg ctg ggg ctt gtt gac acc coo 625
Phe Ala Phe Cys Cys Ala Ser Gly Arg Leu Gly Leu Val Asp Thr Arg
195 200 205

cag aag tgg gca ccg ttg gag aat cgc agc cct ggc cct ggg tct ggt 673
Gln Lys Trp Ala Pro Leu Glu Asn Arg Ser Pro Gly Pro Gly Ser Gly
210 215 220

gga gag aga tgg tgt gct gaa gtt ggg agc tgg ggc cag ggc cct ggg 721
Gly Glu Arg Trp Cys Ala Glu Val Gly Ser Trp Gly Gln Gly Pro Gly
225 230 235

ccc agc att gcc agc ctt agc tca gat ggg cgt ctt tgt ctt ctt gac 769
Pro Ser Ile Ala Ser Leu Ser Ser Asp Gly Arg Leu Cys Leu Leu Asp
240 245 250

ccc cgg gat ctc tgc cat cct gtg agc tca gtc cag tgc cca gta tcc 817
Pro Arg Asp Leu Cys His Pro Val Ser Ser Val Gln Cys Pro Val Ser
255 260 265 270

gta cct agc cct gac cca gag ctg ctg cga gtg act tgg gcc cca ggc 865
Val Pro Ser Pro Asp Pro Glu Leu Leu Arg Val Thr Trp Ala Pro Gly
275 280 285

ctg aag aat tgc ttg gcc atc tca ggt ttt gat ggt aca gtc cag gtc 913
Leu Lys Asn Cys Leu Ala Ile Ser Gly Phe Asp Gly Thr Val Gln Val
290 295 300

tat gat gcc aca tct tgg gat gga aca cgg agc caa gat gga aca cgg 961
Tyr Asp Ala Thr Ser Trp Asp Gly Thr Arg Ser Gln Asp Gly Thr Arg
305 310 315

agc caa gta gaa cct ctc ttc act cac aga ggt cac atc ttc cta gat 1009
Ser Gln Val Glu Pro Leu Phe Thr His Arg Gly His Ile Phe Leu Asp
320 325 330

gga aat ggg atg gac cct gct cct ttg gtc acc acc cac acc tgg cat 1057
Gly Asn Gly Met Asp Pro Ala Pro Leu Val Thr Thr His Thr Trp His
335 340 345 350

ccc tgc aga cca agg act ttg tta tca gca aca aat gat gcc tct ctg 1105
Pro Cys Arg Pro Arg Thr Leu Leu Ser Ala Thr Asn Asp Ala Ser Leu
355 360 365

cat gtg tgg gac tgg gtg gac ctt tgt gcc ccc cgc tgacaccagc 1151
His Val Trp Asp Trp Val Asp Leu Cys Ala Pro Arg
370 375

atctttccat ctaggcctct agaaagggga ggagctgctg tagtagcaag ggtgctgatg 1211
taggactcaa gtgactacca gtccctgtta ccagctgtgt ggccttgggc aagtctgcca 1271
gcgtcactta gcctcagttt ccttatctgt aaaatgagga tagtaagaac tacctcgtag 1331
tgatattgcg aaggttagaa gaaacgcatg gcataattac ttggtagcta ttgttagatc 1391
tgggagtgtg aaatggtagc gttttgtccc tgtcttcaca ctatcatagg gagaatcaaa 1451
agagctaaca aatataaaca tgctttgtga atttttttaa agaaaaaaat gtaggggggc 1511
caataaacat gaaaaaatcc cagccctagt agcaattaag gaaatagcaa aacaggattt 1571
ctgctcctct tgagggggtc tcatgggaac acaggtgcac tttcccacac ttgtcccccc 1631
aggtgactag gttcaagaga catttgcttt tggtggcccc acaaacattt ccttttgagg 1691
gcccatagtg aatatntaaa gtgtgctgga catggtggct catgcctgta atcccagcac 1751
tttcagaggc tgaggtgggc agattgcttg agctgaggag tttgagacca gcctgggcaa 1811
catagcaaga tcccttcccc aaaaaaaaaa aaaaaaaa 1849



<210> 166
<211> 1748
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 136..264

<400> 166

attatttgaa aaaccactgg gttccgagtt cattactaca ggaaaaactt tctcttctgt 60
ggcacagaga accctgcttc aaagcagaag tagcagttcc ggagtccagc tggctaaaac 120
tcatcccaga ggata atg gca acc cat gcc tta gaa atc gct ggg ctg ttt 171
Met Ala Thr His Ala Leu Glu Ile Ala Gly Leu Phe
1 5 10

ctt ggt ggt gtt gga atg gtg ggc aca gtg gct gtc act gtc atg cct 219
Leu Gly Gly Val Gly Met Val Gly Thr Val Ala Val Thr Val Met Pro
15 20 25

cag tgg aga gtg tcg gcc ttc att gaa aac aac atc gtg gtt ttt 264
Gln Trp Arg Val Ser Ala Phe Ile Glu Asn Asn Ile Val Val Phe
30 35 40

taaaacttct gggaaggact gtggatgaat tgcgtgaggc aggctaacat caggatgcag 324
tgcaaaatct atgattccct gctggctctt tctccggacc tacaggcagc cagaggactg 384
atgtgtgctg cttccgtgat gtccttcttg gctttcatga tggccatcct tggcatgaaa 444
tgcaccaggt gcacggggga caatgagaag gtgaaggctc acattctgct gacggctgga 504
atcatcttca tcatcacggg catggtggtg ctcatccctg tgagctgggt tgccaatgcc 564
atcatcagag atttctataa ctcaatagtg aatgttgccc aaaaacgtga gcttggagaa 624
gctctctact taggatggac cacggcactg gtgctgattg ttggaggagc tctgttctgc 684
tgcgtttttt gttgcaacga aaagagcagt agctacagat actcgatacc ttcccatcgc 744
acaacccaaa aaagttatca caccggaaag aagtcaccga gcgtctactc cagaagtcag 804
tatgtgtagt tgtgtatgtt tttttaactt tactataaag ccatgcaaat gacaaaaatc 864
tatattactt tctcaaaatg gaccccaaag aaactttgat ttactgttct taactgccta 924
atcttaatta caggaactgt gcatcagcta tttatgattc tataagctat ttcagcagaa 984
tgagatatta aaccgaatgc tttgattgtt ctagaaagta tagtaatttg ttttctaagg 1044
tggktcaagc atctactctt tttatcattt acttcaaaat gacattgcta aagactgcat 1104
tattttacta ctgtaatttc tccacgacat agcattatct acatagatga gtgtaacatt 1164
tatatctcac atagagacat gcttatatgg ttkcatttaa aatgaaatgc cagtccatta 1224
cactgaataa atagaactca actattgctt ttcagggaaa tcatggatag ggttgaagaa 1284
ggttactatt aattgtttaa aaacagctta gggattaatg tcctccattt ataatgaaga 1344
ttaaaatgaa ggctttaatc agcattgtaa aggaaattga atggctttct gatatgctgt 1404
tttttagcct aggagttaga aatcctaact tctttatcct cttctcccag aggctttttt 1464
tttcttgtgt attaaattaa catttttaaa aagcagatat tttgtcaagg ggctttgcat 1524
tcaaactgct tttccagggc tatactcaga agaaagataa aagtgtgatc taagaaaaag 1584
tgatggtttt aggaaagtga aaatattttt gtttttgtat ttgaagaaga atgatgcatt 1644
ttgacaagaa atcatatatg tatggatata ttttaataag tatttgagta cagactttga 1704
ggtttcatca atataaataa aagagcaaaa aaaaaaaaaa aaaa 1748



<210> 167
<211> 1275
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 14..1048

<400> 167

agaggttggg aag atg gcg tgg cga ggc tgg gcg cag aga ggc tgg ggc 49
Met Ala Trp Arg Gly Trp Ala Gln Arg Gly Trp Gly
1 5 10

tgc ggc cag gcg tgg ggt gcg tcg gtg ggc ggc cgc agc tgc gag gag 97
Cys Gly Gln Ala Trp Gly Ala Ser Val Gly Gly Arg Ser Cys Glu Glu
15 20 25

ctc act gcg gtc cta acc ccg ccg cag ctc ctc gga cgc agg ttt aac 145
Leu Thr Ala Val Leu Thr Pro Pro Gln Leu Leu Gly Arg Arg Phe Asn
30 35 40

ttc ttt att caa caa aaa tgc gga ttc aga aaa gca ccc agg aag gtt 193
Phe Phe Ile Gln Gln Lys Cys Gly Phe Arg Lys Ala Pro Arg Lys Val
45 50 55 60

gaa cct cga aga tca gac cca ggg aca agt ggt gaa gca tac aag aga 241
Glu Pro Arg Arg Ser Asp Pro Gly Thr Ser Gly Glu Ala Tyr Lys Arg
65 70 75

agt gct ttg att cct cct gtg gaa gaa aca gtc ttt tat cct tct ccc 289
Ser Ala Leu Ile Pro Pro Val Glu Glu Thr Val Phe Tyr Pro Ser Pro
80 85 90

tat cct ata agg agt ctc ata aaa cct tta ttt ttt act gtt ggg ttt 337
Tyr Pro Ile Arg Ser Leu Ile Lys Pro Leu Phe Phe Thr Val Gly Phe
95 100 105

aca ggc tgt gca ttt gga tca gct gct att tgg caa tat gaa tca ctg 385
Thr Gly Cys Ala Phe Gly Ser Ala Ala Ile Trp Gln Tyr Glu Ser Leu
110 115 120

aaa tcc aaa gtc cag agt tat ttt gat ggt ata aaa gct gat tgg ttg 433
Lys Ser Lys Val Gln Ser Tyr Phe Asp Gly Ile Lys Ala Asp Trp Leu
125 130 135 140

gat agc ata aga cca caa aaa gaa gga gac ttc aga aag gag att aac 481
Asp Ser Ile Arg Pro Gln Lys Glu Gly Asp Phe Arg Lys Glu Ile Asn
145 150 155

aag tgg tgg aat aac cta agt gat ggc cag cgg act gtg aca ggt att 529
Lys Trp Trp Asn Asn Leu Ser Asp Gly Gln Arg Thr Val Thr Gly Ile
160 165 170

ata gct gca aat gtc ctt gta ttc tgt tta tgg aga gta cct tct ctg 577
Ile Ala Ala Asn Val Leu Val Phe Cys Leu Trp Arg Val Pro Ser Leu
175 180 185

cag cgg aca atg atc aga tat ttc aca tcg aat cca gcc tca aag gtc 625
Gln Arg Thr Met Ile Arg Tyr Phe Thr Ser Asn Pro Ala Ser Lys Val
190 195 200

ctt tgt tct cca atg ttg ctg tca aca ttc agt cat ttc tcc tta ttt 673
Leu Cys Ser Pro Met Leu Leu Ser Thr Phe Ser His Phe Ser Leu Phe
205 210 215 220

cac atg gca gca aat atg tat gtt ttg tgg agc ttc tct tcc agc ata 721
His Met Ala Ala Asn Met Tyr Val Leu Trp Ser Phe Ser Ser Ser Ile
225 230 235

gtg aac att ctg ggt caa gag cag ttc atg gca gtg tac cta tct gca 769
Val Asn Ile Leu Gly Gln Glu Gln Phe Met Ala Val Tyr Leu Ser Ala
240 245 250

ggt gtt att tcc aat ttt gtc agt tac gtg ggt aaa gtt gcc aca gga 817
Gly Val Ile Ser Asn Phe Val Ser Tyr Val Gly Lys Val Ala Thr Gly
255 260 265

[illegible] tat gga cca tca ctt ggt [illegible] gcc ctg aaa gcc att atc gcc atg 865
Arg Tyr Gly Pro Ser Leu Gly Ala Ala Leu Lys Ala Ile Ile Ala Met
270 275 280

gat aca gca gga atg atc ctg gga tgg aaa ttt ttt gat cat gcg gca 913
Asp Thr Ala Gly Met Ile Leu Gly Trp Lys Phe Phe Asp His Ala Ala
285 290 295 300

cat ctt ggg gga gct ctt ttt gga ata tgg tat gtt act tac ggt cat 961
His Leu Gly Gly Ala Leu Phe Gly Ile Trp Tyr Val Thr Tyr Gly His
305 310 315

gaa ctg att tgg aag aac agg gag ccg cta gtg aaa atc tgg cat gaa 1009
Glu Leu Ile Trp Lys Asn Arg Glu Pro Leu Val Lys Ile Trp His Glu
320 325 330

ata agg act aat ggc ccc aaa aaa gga ggt ggc tct aag taaaactggg 1058
Ile Arg Thr Asn Gly Pro Lys Lys Gly Gly Gly Ser Lys
335 340 345

attggacagt agtggtgcat ctggtccttg ccgcctgaga gccccaggag acatcggcta 1118
gagtgaccat ggctatgctc ccgtctggaa gatgccagca tctggcctcc cacttttttc 1178
agctgtgtcc cccagtccgt gtctttttag aatgtgaatg atgataaagt tgtgaaataa 1238
aggtttctat ctagtttgca aaaaaaaaaa aaaaaaa 1275

<210> 168
<211> 1023
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 70..777

<400> 168

aataggtccg gttccggggg cgcgtggctg cagcggggcc cgcgtggtgc ctcctgaggc 60
ggcccccgg atg aag aga tct ggg aac ccg gga gcc gag gta acg aac agc 111
Met Lys Arg Ser Gly Asn Pro Gly Ala Glu Val Thr Asn Ser
1 5 10

tcg gtg gca ggg cct gac tgc tgc gga ggc ctc ggc aat att gat ttt 159
Ser Val Ala Gly Pro Asp Cys Cys Gly Gly Leu Gly Asn Ile Asp Phe
15 20 25 30

aga cag gca gac ttc tgc gtt atg acc cgg ctg ctg ggc tac gtg gac 207
Arg Gln Ala Asp Phe Cys Val Met Thr Arg Leu Leu Gly Tyr Val Asp
35 40 45

ccc ctg gat ccc agc ttt gtg gct gcc gtc atc acc atc acc ttc aat 255
Pro Leu Asp Pro Ser Phe Val Ala Ala Val Ile Thr Ile Thr Phe Asn
50 55 60

ccg ctc tac tgg aat gtg gtt gca cga tgg gaa cac aag acc cgc aag 303
Pro Leu Tyr Trp Asn Val Val Ala Arg Trp Glu His Lys Thr Arg Lys
65 70 75

ctg agc agg gcc ttc gga tcc ccc tac ctg gcc tgc tac tct cta agc 351
Leu Ser Arg Ala Phe Gly Ser Pro Tyr Leu Ala Cys Tyr Ser Leu Ser
80 85 90

atc acc atc ctg ctc ctg aac ttc ctg cgc tcg cac tgc ttc acg cag 399
Ile Thr Ile Leu Leu Leu Asn Phe Leu Arg Ser His Cys Phe Thr Gln
95 100 105 110

gcc atg ctg agc cag ccc agg atg gag agc ctg gac acc ccc gcg gcc 447
Ala Met Leu Ser Gln Pro Arg Met Glu Ser Leu Asp Thr Pro Ala Ala
115 120 125

tac agc ctg gtc ctc gca ctc ctg gga ctg ggc gtc gtg ctc gtg ctc 495
Tyr Ser Leu Val Leu Ala Leu Leu Gly Leu Gly Val Val Leu Val Leu
130 135 140

tcc agc ttc ttt gca ctg ggg ttc gct gga act ttc cta ggt gat tac 543
Ser Ser Phe Phe Ala Leu Gly Phe Ala Gly Thr Phe Leu Gly Asp Tyr
145 150 155

ttc ggg atc ctc aag gag gcg aga gtg acc gtg ttc ccc ttc aac atc 591
Phe Gly Ile Leu Lys Glu Ala Arg Val Thr Val Phe Pro Phe Asn Ile
160 165 170

ctg gac aac ccc atg tac tgg gga agc aca gcc aac tac ctg ggc tgg 639
Leu Asp Asn Pro Met Tyr Trp Gly Ser Thr Ala Asn Tyr Leu Gly Trp
175 180 185 190

gcc atc atg cac gcc agc ccc acg ggc ctg ctc ctg acg gtg ctg gtg 687
Ala Ile Met His Ala Ser Pro Thr Gly Leu Leu Leu Thr Val Leu Val
195 200 205

gcc ctc acc tac ata gtg gct ctc cta tac gaa gag ccc ttc acc gct 735
Ala Leu Thr Tyr Ile Val Ala Leu Leu Tyr Glu Glu Pro Phe Thr Ala
210 215 220

gag atc tac cgg cag aaa gcc tcc ggg tcc cac aag agg agc 777
Glu Ile Tyr Arg Gln Lys Ala Ser Gly Ser His Lys Arg Ser
225 230 235

tgattgagct gcaacagctt tgctgaaggc ctggccagcc tcctggcctg ccccaagtgg 837
caggccctgc gcagggcgag aatggtgcct gctgctcagg gctcgccccc ggcgtgggct 897
gccccagtgc cttggaacct gctgccttgg ggaccctgga cgtgccgaca tatggccatt 957
gagctccaac ccacacattc ccattcacca ataaaggcac cctgacctca aaaaaaaaaa 1017
aaaaaa 1023

<210> 169
<211> 1085
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 38..400

<400> 169

aacaattcat gaagttgaag aaaagacact gtcagaa atg aac aca gaa gcg gag 55
Met Asn Thr Glu Ala Glu
1 5

caa cag ctt ctc cat cac gcc aga aat ggc aat gct gaa gaa gta aga 103
Gln Gln Leu Leu His His Ala Arg Asn Gly Asn Ala Glu Glu Val Arg
10 15 20

caa cta tta gag acc atg gcg agt aat gaa gtg att gct gac att aat 151
Gln Leu Leu Glu Thr Met Ala Ser Asn Glu Val Ile Ala Asp Ile Asn
25 30 35

tgc aaa gga aga agt aag tct aac ttg ggc tgg aca ccc cta cat ctg 199
Cys Lys Gly Arg Ser Lys Ser Asn Leu Gly Trp Thr Pro Leu His Leu
40 45 50

gca tgc tat ttt gga cac aga caa gtg gtc cag gat ctg ttg aag gct 247
Ala Cys Tyr Phe Gly His Arg Gln Val Val Gln Asp Leu Leu Lys Ala
55 60 65 70

ggt gca gaa gtg aat gtg ttg aat gac atg gga gac acg ccg ctt cat 295
Gly Ala Glu Val Asn Val Leu Asn Asp Met Gly Asp Thr Pro Leu His
75 80 85

cga gct gcc ttt aca gga cga aag gtg aaa atc att cta tgt tca atg 343
Arg Ala Ala Phe Thr Gly Arg Lys Val Lys Ile Ile Leu Cys Ser Met
90 95 100

ttt gta agt gag gta ttt gga gga gta gtt acc att gtt ttc tct gtt 391
Phe Val Ser Glu Val Phe Gly Gly Val Val Thr Ile Val Phe Ser Val
105 110 115

ata acc atc tgaccagcaa ccgaagaaag ccacacaaaa aaatgtatac 440
Ile Thr Ile
120

accagcactt tgggtcaaaa ggccacagga tcttttgagt ctgacagtga ggtccagtac 500
taaggtcatg gagaccccca ctctgtagca tccctgtgag gagatcattc cgtttctgct 560
tgtgtactcc agcaatgggg aactcctgat tattcttttt ttttaaaaaa aaatagcttc 620
attgaggtat aacttacatt gcataaactt cacctgtgat attgtgaaat atatatttgg 680
tctttgacct tgtacactaa agatgtacaa aaagatgact ggcaacccct ggcttcagga 740
tgggggctgg tcaccagaaa gaccaaggca ggactagggg gttgggactt tcagccgaac 800
tttgcaacct ccagggaggg tagaggggct gaaggggaaa tggctcgcta atggccagtg 860
gtttcatcaa tcatgcctat ttaatggaac ctccataaaa acctgaaagg acagggttct 920
aggagctcct gggtagctga acacgtggag gttcttgaat gatcacaccc agggagggca 980
tgggtgctct gtgcccttcc tccatgcctt gctttatgta tctcttcatc tgtatccttt 1040
gtaataaagc agtaaacatg ttttcctgaa aaaaaaaaaa aaaaa 1085

<210> 170
<211> 776
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 63..572

<400> 170

atatgtcatc aggccccccg cctgggaggt gtgctgccag agattttgcc tcttcaaggt 60
ga atg cgg ctt caa ggg gct atc ttt gtg ctc ctg ccc cac ctg ggg 107
Met Arg Leu Gln Gly Ala Ile Phe Val Leu Leu Pro His Leu Gly
1 5 10 15

ccc atc ctg gtc tgg ctg ttc act cgt gat cac atg tct ggt tgg tgt 155
Pro Ile Leu Val Trp Leu Phe Thr Arg Asp His Met Ser Gly Trp Cys
20 25 30

gag ggc ccg agg atg ctg tcc tgg tgc cca ttc tac aaa gtc tta ttg 203
Glu Gly Pro Arg Met Leu Ser Trp Cys Pro Phe Tyr Lys Val Leu Leu
35 40 45

ctt gta cag aca gcc atc tac tct gtc gtg ggc tat gcc tcc tac ctg 251
Leu Val Gln Thr Ala Ile Tyr Ser Val Val Gly Tyr Ala Ser Tyr Leu
50 55 60

gtg tgg aag gac ctg gga ggg ggc ttg ggg tgg ccc ctg gcc ctg cct 299
Val Trp Lys Asp Leu Gly Gly Gly Leu Gly Trp Pro Leu Ala Leu Pro
65 70 75

ctt ggc ctc tat gct gtt cag ctc acc atc agc tgg act gtc ctg gtt 347
Leu Gly Leu Tyr Ala Val Gln Leu Thr Ile Ser Trp Thr Val Leu Val
80 85 90 95

ctc ttt ttc aca gtc cac aac cct ggt ctg gcc ctg ctg cac ctg ctg 395
Leu Phe Phe Thr Val His Asn Pro Gly Leu Ala Leu Leu His Leu Leu
100 105 110

ctg ctg tat ggg ctg gtg gtg agc aca gca ctg atc tgg cat ccc atc 443
Leu Leu Tyr Gly Leu Val Val Ser Thr Ala Leu Ile Trp His Pro Ile
115 120 125

aac aaa ctg gct gcc ctg tta ctg ctg ccc tac cta gcc tgg ctc acc 491
Asn Lys Leu Ala Ala Leu Leu Leu Leu Pro Tyr Leu Ala Trp Leu Thr
130 135 140

gtg act tca gcc ctc acc tac cac ctg tgg agg gac agc ctt tgt cca 539
Val Thr Ser Ala Leu Thr Tyr His Leu Trp Arg Asp Ser Leu Cys Pro
145 150 155

gtg cac cag cct cag ccc acg gag aag agt gac tgaggcccta gggcatggga 592
Val His Gln Pro Gln Pro Thr Glu Lys Ser Asp
160 165 170

gaggagggac gcccagggtg gggaggaaga gtctgcaagc agggctgtgg agttagggtt 652
caccccaatg ggaccaccct cctgggtccc ctggtgccgt ttttccttag aaatcagaga 712
aatgggaaag ggggggaaac tgattttaca cttaaataat aaaatcctat tagcaaaaaa 772
aaaa 776



<210> 171
<211> 1219
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 160..867

<400> 171

gtagttagga gtctggagtc gtgagccgga gtcagaactg cgtctcgcga cccaggcgcg 60
ggtttccgga ggacagccaa caagcgatgc tgccgccgcc gtttcctgat tggttgtggg 120
tggctacctc ttcgttctga ttggccgcta gtgagcaag atg ctg agc aag ggt 174
Met Leu Ser Lys Gly
1 5

ctg aag cgg aaa cgg gag gag gag gag gag aag gaa cct ctg gca gtc 222
Leu Lys Arg Lys Arg Glu Glu Glu Glu Glu Lys Glu Pro Leu Ala Val
10 15 20

gac tcc tgg tgg cta gat cct ggc cac aca gcg gtg gca cag gca ccc 270
Asp Ser Trp Trp Leu Asp Pro Gly His Thr Ala Val Ala Gln Ala Pro
25 30 35

ccg gcc gtg gcc tct agc tcc ctc ttt gac ctc tca gtg ctc aag ctc 318
Pro Ala Val Ala Ser Ser Ser Leu Phe Asp Leu Ser Val Leu Lys Leu
40 45 50

cac cac agc ctg cag cag agt gag ccg gac ctg cgg cac ctg gtg ctg 366
His His Ser Leu Gln Gln Ser Glu Pro Asp Leu Arg His Leu Val Leu
55 60 65

gtc gtg aac act ctg cgg cgc atc cag gcg tcc atg gca ccc gcg gct 414
Val Val Asn Thr Leu Arg Arg Ile Gln Ala Ser Met Ala Pro Ala Ala
70 75 80 85

gcc ctg cca cct gtg cct agc cca cct gca gcc ccc agt gtg gct gac 462
Ala Leu Pro Pro Val Pro Ser Pro Pro Ala Ala Pro Ser Val Ala Asp
90 95 100

aac tta ctg gca agc tcg gac gct gcc ctt tca gcc tcc atg gcc agc 510
Asn Leu Leu Ala Ser Ser Asp Ala Ala Leu Ser Ala Ser Met Ala Ser
105 110 115

ctc ctg gag gac ctc agc cac att gag ggc ctg agt cag gct ccc caa 558
Leu Leu Glu Asp Leu Ser His Ile Glu Gly Leu Ser Gln Ala Pro Gln
120 125 130

ccc ttg gca gac gag ggg cca cca ggc cgt agc atc ggg gga gca gcg 606
Pro Leu Ala Asp Glu Gly Pro Pro Gly Arg Ser Ile Gly Gly Ala Ala
135 140 145

ccc agc ctg ggt gcc ttg gac ctg ctg ggc cca gcc act ggc tgt cta 654
Pro Ser Leu Gly Ala Leu Asp Leu Leu Gly Pro Ala Thr Gly Cys Leu
150 155 160 165




ctg 


gac 


gat 


ggg 


ctt 


gag 


ggc 


ctg 


ttt 


gag 


gat 


att 


gac 


ace 


tct 


atg 


702 


Leu 


Asp 


Asp 


Gly 


Leu 


Glu 


Gly 


Leu 


Phe 


Glu 


Asp 


He 


Asp 


Thr 


Ser 


Met 












170 










175 










180 






tat 


gac 


aat 


gaa 


ctt 


tgg 


gca 


cca 


gee 


tct 


gag 


ggc 


etc 


aaa 


cca 


ggc 


750 


Tyr 


Asp 


Asn 


Glu 


Leu 


Trp 


Ala 


Pro 


Ala 


Ser 


Glu 


Gly 


Leu 


Lys 


Pro 


Gly 










185 










190 










195 








cct 


gag 


gat 


ggg 


ccg 


ggc 


aag 


gag 


gaa 


get 


ccg 


gag 


ctg 


gac 


gag 


gee 


798 


Pro 


Glu 


Asp 


Gly 


Pro 


Gly 


Lys 


Glu 


Glu 


Ala 


Pro 


Glu 


Leu 


Asp 


Glu 


Ala 








200 










205 










210 










gaa 


ttg 


gac 


tac 


etc 


atg 


gat 


gtg 


ctg 


gtg 


ggc 


aca 


cag 


gca 


ctg 


gag 


846 


Glu 


Leu 


Asp 


Tyr 


Leu 


Met 


Asp 


Val 


Leu 


Val 


Gly 


Thr 


Gin 


Ala 


Leu 


Glu 






215 










220 










225 












cga 


ccg 


ccg 


ggg 


cca 


ggg 


cgc 


tgagccctcg 1 


tgctggaatg gttgtctggt 


897 


Arg 


Pro 


Pro 


Gly 


Pro 


Gly 


Arg 






















230 










235 

























atctgaactg agectgetgg ctggaccaac tgtcctcgaa aagacacagc tggcttccct 957 

agtacagaga acagggcttg ggccactttg gagagacaga atctagtcct gggcaacttc 1017 

acatccgtcc tcctgtctca gggctggcag ggggagcctg gaattacccc ctagtgatgg 1077 

aatgacaggg tctggtgggg acttaattcc ctggccctgg ggtcatagct tgggctgttc 1137 

cttctctgat aegggaagag accccaatca gatttttcaa attaaageca gtcctgggaa 1197 

atctcaaaaa aaaaaaaaaa aa 1219 



<210> 172
<211> 1487
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 68..640

<400> 172

gacgaaggac tggaaggtgg cggtggtgaa ggtgcaggcc gttggggcgg ctcagaggca 60
ggtgact atg aaa ggc tta tat ttc caa cag agt tcc aca gat gaa gaa 109
Met Lys Gly Leu Tyr Phe Gln Gln Ser Ser Thr Asp Glu Glu
1 5 10

ata aca ttt gta ttt caa gaa aag gaa gat ctt cct gtt aca gag gat 157
Ile Thr Phe Val Phe Gln Glu Lys Glu Asp Leu Pro Val Thr Glu Asp
15 20 25 30

aac ttt gtg aaa ctt caa gtt aaa gct tgt gct ctg agc cag ata aat 205
Asn Phe Val Lys Leu Gln Val Lys Ala Cys Ala Leu Ser Gln Ile Asn
35 40 45

aca aag ctt ctg gca gaa atg aag atg aaa aag gat tta ttt cct gtt 253
Thr Lys Leu Leu Ala Glu Met Lys Met Lys Lys Asp Leu Phe Pro Val
50 55 60

ggg aga gaa att gct gga att gta tta gat gtt gga agc aag gta tca 301
Gly Arg Glu Ile Ala Gly Ile Val Leu Asp Val Gly Ser Lys Val Ser
65 70 75

ttc ttt caa cca gat gat gaa gta gtt gga att ttg ccc ctg gac tct 349
Phe Phe Gln Pro Asp Asp Glu Val Val Gly Ile Leu Pro Leu Asp Ser
80 85 90

gaa gac cct gga ctt tgt gaa gtt gtt aga gta cat gag cat tac ttg 397
Glu Asp Pro Gly Leu Cys Glu Val Val Arg Val His Glu His Tyr Leu
95 100 105 110

gtt cat aaa cca gaa aag gtc aca tgg acg gaa gca gca gga agc att 445
Val His Lys Pro Glu Lys Val Thr Trp Thr Glu Ala Ala Gly Ser Ile
115 120 125

cgg gat gga gtg cgt gcc tat aca gct ctg cat tat ctt tct cat ctc 493
Arg Asp Gly Val Arg Ala Tyr Thr Ala Leu His Tyr Leu Ser His Leu
130 135 140

tct cct gga aaa tca gtg ctg ata atg gat gga gca agt gca ttt ggt 541
Ser Pro Gly Lys Ser Val Leu Ile Met Asp Gly Ala Ser Ala Phe Gly
145 150 155

aca ata gct att cag tta gca cat cat aga gga gcc aaa gta ttt caa 589
Thr Ile Ala Ile Gln Leu Ala His His Arg Gly Ala Lys Val Phe Gln
160 165 170

cag cat gca gcc ttg aag ata agc agt gcc ttg aaa gat tca gac ctc 637
Gln His Ala Ala Leu Lys Ile Ser Ser Ala Leu Lys Asp Ser Asp Leu
175 180 185 190

cca tagcccgagt gattgatgta tctaatggga aagttcatgt tgctgaaagc 690
Pro

tgtttggaag aaacaggtgg cctgggagta gatattgtcc tagatgctgg agtgagatta 750
tatagtaaag atgatgaacc agctgtaaaa ctacaactac taccacataa acatgatatc 810
atcacacttc ttggtgttgg aggccactgg gtaacaacag aagaaaacct tcagttggat 870
cctccagata gccactgcct tttcctcaag ggagcaacgt tagctttcct gaatgatgaa 930
gtttggaatt tgtcaaatgt acaacaggga aaatatcttt gtatcttaaa ggatgtgatg 990
gagaagttat caactggtgt tttcagacct cagttggatg aacccattcc actgtatgag 1050
gcaaaagttt ccatggaagc tgttcagaaa aatcaaggaa gaaaaaagca agttgttcaa 1110
ttttaatttt cttctttctc agacctcagt cggatgaaca tattccagta tttgaagcca 1170
gaattttctt tggaaattgt tgagaaaaac caaggaagat aaaacaagtt gcatttttaa 1230
gcacgtttct ctgctaagac aagatgctca gttgacacat ttgaaaagtg tttgaaaaat 1290
tcttgtgcaa atgatcaaga taattctata attaacatct taagggaatt tttctaaaaa 1350
ccttttcatt gtttctatat attttgcccc tgctataaaa ttccttccat gaagaaaact 1410
gctgctttca gcaaaagtca cactactctt gataaaagct gttgcaggcc tttgctaagc 1470
aaaaaaaaaa aaaaaaa 1487



<210> 173 

<211> 1915 

<212> DNA 

<213> Homo sapiens

<220>
<221> CDS
<222> 132..1298



<400> 173 

aactcccatt tctggtgccg tcacgggaca gagcagtcgg tgacaggaca gagcagtcgg 60
tgacgggaca cagtggttgg tgacgggaca gagcggtcgg tgacagcctc aagggcttca 120
gcaccgcgcc c atg gca gag cca gac ccc tct cac cct ctg gag acc cag 170
Met Ala Glu Pro Asp Pro Ser His Pro Leu Glu Thr Gln
1 5 10

gca ggg aag gtg cag gag gct cag gac tca gat tca gac tct gag gga 218
Ala Gly Lys Val Gln Glu Ala Gln Asp Ser Asp Ser Asp Ser Glu Gly
15 20 25

gga gcc gct ggt gga gaa gca gac atg gac ttc ctg cgg aac tta ttc 266
Gly Ala Ala Gly Gly Glu Ala Asp Met Asp Phe Leu Arg Asn Leu Phe
30 35 40 45

tcc cag acg ctc agc ctg ggc agc cag aag gag cgt ctg ctg gac gag 314
Ser Gln Thr Leu Ser Leu Gly Ser Gln Lys Glu Arg Leu Leu Asp Glu
50 55 60

ctg acc ttg gaa ggg gtg gcc cgg tac atg cag agc gaa cgc tgt cgc 362
Leu Thr Leu Glu Gly Val Ala Arg Tyr Met Gln Ser Glu Arg Cys Arg
65 70 75

aga gtc atc tgt ttg gtg gga gct gga atc tcc aca tcc gca ggc atc 410
Arg Val Ile Cys Leu Val Gly Ala Gly Ile Ser Thr Ser Ala Gly Ile
80 85 90

ccc gac ttt cgc tct cca tcc acc ggc ctc tat gac aac cta gag aag 458
Pro Asp Phe Arg Ser Pro Ser Thr Gly Leu Tyr Asp Asn Leu Glu Lys
95 100 105

tac cat ctt ccc tac cca gag gcc atc ttt gag atc agc tat ttc aag 506
Tyr His Leu Pro Tyr Pro Glu Ala Ile Phe Glu Ile Ser Tyr Phe Lys
110 115 120 125

aaa cat ccg gaa ccc ttc ttc gcc ctc gcc aag gaa ctc tat cct ggg 554
Lys His Pro Glu Pro Phe Phe Ala Leu Ala Lys Glu Leu Tyr Pro Gly
130 135 140

cag ttc aag cca acc atc tgt cac tac ttc atg cgc ctg ctg aag gac 602
Gln Phe Lys Pro Thr Ile Cys His Tyr Phe Met Arg Leu Leu Lys Asp
145 150 155

aag ggg cta ctc ctg cgc tgc tac acg cag aac ata gat acc ctg gag 650
Lys Gly Leu Leu Leu Arg Cys Tyr Thr Gln Asn Ile Asp Thr Leu Glu
160 165 170

cga ata gcc ggg ctg gaa cag gag gac ttg gtg gag gcg cac ggc acc 698
Arg Ile Ala Gly Leu Glu Gln Glu Asp Leu Val Glu Ala His Gly Thr
175 180 185

ttc tac aca tca cac tgc gtc agc gcc agc tgc cgg cac gaa tac ccg 746
Phe Tyr Thr Ser His Cys Val Ser Ala Ser Cys Arg His Glu Tyr Pro
190 195 200 205

cta agc tgg atg aaa gag aag atc ttc tct gag gtg acg ccc aag tgt 794
Leu Ser Trp Met Lys Glu Lys Ile Phe Ser Glu Val Thr Pro Lys Cys
210 215 220

gaa gac tgt cag agc ctg gtg aag cct gat atc gtc ttt ttt ggt gag 842
Glu Asp Cys Gln Ser Leu Val Lys Pro Asp Ile Val Phe Phe Gly Glu
225 230 235

agc ctc cca gcg cgt ttc ttc tcc tgt atg cag tca gac ttc ctg aag 890
Ser Leu Pro Ala Arg Phe Phe Ser Cys Met Gln Ser Asp Phe Leu Lys
240 245 250

gtg gac ctc ctc ctg gtc atg ggt acc tcc ttg cag gtg cag ccc ttt 938
Val Asp Leu Leu Leu Val Met Gly Thr Ser Leu Gln Val Gln Pro Phe
255 260 265

gcc tcc ctc atc agc aag gca ccc ctc tcc acc cct cgc ctg ctc atc 986
Ala Ser Leu Ile Ser Lys Ala Pro Leu Ser Thr Pro Arg Leu Leu Ile
270 275 280 285

aac aag gag aaa gct ggc cag tcg gac cct ttc ctg ggg atg att atg 1034
Asn Lys Glu Lys Ala Gly Gln Ser Asp Pro Phe Leu Gly Met Ile Met
290 295 300

ggc ctc gga gga ggc atg gac ttt gac tcc aag aag gcc tac agg gac 1082
Gly Leu Gly Gly Gly Met Asp Phe Asp Ser Lys Lys Ala Tyr Arg Asp
305 310 315

gtg gcc tgg ctg ggt gaa tgc gac cag ggc tgc ctg gcc ctt gct gag 1130
Val Ala Trp Leu Gly Glu Cys Asp Gln Gly Cys Leu Ala Leu Ala Glu
320 325 330

ctc ctt gga tgg aag aag gag ctg gag gac ctt gtc cgg agg gag cac 1178
Leu Leu Gly Trp Lys Lys Glu Leu Glu Asp Leu Val Arg Arg Glu His
335 340 345

gcc agc ata gat gcc cag tcg ggg gcg ggg gtc ccc aac ccc agc act 1226
Ala Ser Ile Asp Ala Gln Ser Gly Ala Gly Val Pro Asn Pro Ser Thr
350 355 360 365

tca gct tcc ccc aag aag tcc ccg cca cct gcc aag gac gag gcc agg 1274
Ser Ala Ser Pro Lys Lys Ser Pro Pro Pro Ala Lys Asp Glu Ala Arg
370 375 380

aca aca gag agg gag aaa ccc cag tgacagctgc atctcccagg cgggatgccg 1328
Thr Thr Glu Arg Glu Lys Pro Gln
385

agctcctcag ggacagctga gccccaaccg ggcctggccc cctcttaacc agcagttctt 1388
gtctggggag ctcagaacat cccccaatct cttacagctc cctccccaaa actggggtcc 1448
cagcaaccct ggcccccaac cccagcaaat ctctaacacc tcctagaggc caaggcttaa 1508
acaggcatct ctaccagccc cactgtctct aaccactcct gggctaagga gtaacctccc 1568
tcatctctaa ctgcccccac ggggccaggg ctaccccaga acttttaact cttccaggac 1628
agggagcttc gggcccccac tctgtctcct gcccccgggg gcctgtggct aagtaaacca 1688
tacctaacct accccagtgt gggtgtgggc ctctgaatct aacccacacc cagcgtaggg 1748
ggagtctgag ccgggagggc tcccgagtct ctgccttcag ctcccaaagt gggtggtggg 1808
cccccttcac gtgggaccca cttcccatgc tggatgggca gaagacattg cttattggag 1868
acaaattaaa aacaaaaaca actaacaaag aaaaaaaaaa aaaaaaa 1915



<210> 174
<211> 1990
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 259..1701



<400> 174 

ttagaacatc ytaatcaaaa aattttggtg cgagagaaac aataggacgg aaacgccgag 60 

gaacccggct gaggcggcag cttcctaggt gacagacagg tacactgtat gctagccctg 120
tatctgtctg agcagtggaa tgtgccagga aagaaggagc aaccactgac tgatgaacct 180
ttgccagtct cccttccaag agggatgcca gagccttctg taagctcctc agatgtcact 240
ggtatctagg caacaggg atg agc ctg aac ctc cct gag gcc agc tta ctt 291
Met Ser Leu Asn Leu Pro Glu Ala Ser Leu Leu
1 5 10

aga gca tcc t~ a a cca gaa caa gcc aag gag cca aga cga gag gga 339
Ser Arg Ala Ser Pro Glu Gln Ala Lys Glu Pro Arg Arg Glu Gly
15 20 25

cac acg gac aaa caa cag aca gaa gac gta ctg gcc gct gga ctc cgc 387
His Thr Asp Lys Gln Gln Thr Glu Asp Val Leu Ala Ala Gly Leu Arg
30 35 40

tgc ctc ccc cat ctc ccc gcc atc tgc gcc cgg agg atg agc cca gcc 435
Cys Leu Pro His Leu Pro Ala Ile Cys Ala Arg Arg Met Ser Pro Ala
45 50 55

ttc agg gcc atg gat gtg gag ccc cgc gca aaa ggc gtc ctt ctg gag 483
Phe Arg Ala Met Asp Val Glu Pro Arg Ala Lys Gly Val Leu Leu Glu
60 65 70 75

ccc ttt gtc cac cag gtc ggg ggg cac tca tgc gtg ctc cgc ttc aat 531
Pro Phe Val His Gln Val Gly Gly His Ser Cys Val Leu Arg Phe Asn
80 85 90

gag aca acc ctg tgc aag ccc ctg gtc cca agg gaa cat cag ttc tac 579
Glu Thr Thr Leu Cys Lys Pro Leu Val Pro Arg Glu His Gln Phe Tyr
95 100 105

gag acc ctc cct gct gag atg cgc aaa ttc act ccc cag tac aaa ggt 627
Glu Thr Leu Pro Ala Glu Met Arg Lys Phe Thr Pro Gln Tyr Lys Gly
110 115 120

gtg gta tct gtg cgc ttt gaa gaa gat gaa gac agg aac ttg tgt cta 675
Val Val Ser Val Arg Phe Glu Glu Asp Glu Asp Arg Asn Leu Cys Leu
125 130 135

ata gca tat cca ttg aaa ggg gac cat gga att gtg gac att gta gat 723
Ile Ala Tyr Pro Leu Lys Gly Asp His Gly Ile Val Asp Ile Val Asp
140 145 150 155

aat tca gac tgt gaa cca aaa agt aag ctc cta agg tgg aca aca aac 771
Asn Ser Asp Cys Glu Pro Lys Ser Lys Leu Leu Arg Trp Thr Thr Asn
160 165 170

aaa aaa cat cat gtc tta gaa aca gaa aag acc cct aag gac tgg gtg 819
Lys Lys His His Val Leu Glu Thr Glu Lys Thr Pro Lys Asp Trp Val
175 180 185

cgt cag cac cgt aaa gag gag aaa atg aag agc cat aag tta gaa gaa 867
Arg Gln His Arg Lys Glu Glu Lys Met Lys Ser His Lys Leu Glu Glu
190 195 200

gaa ttt gag tgg cta aag aaa tct gaa gtc ttg tac tac act gta gag 915
Glu Phe Glu Trp Leu Lys Lys Ser Glu Val Leu Tyr Tyr Thr Val Glu
205 210 215

aag aag ggg aat ata agt tcc cag ctt aaa cac tat aac cct tgg agc 963
Lys Lys Gly Asn Ile Ser Ser Gln Leu Lys His Tyr Asn Pro Trp Ser
220 225 230 235

atg aaa tgt cac cag caa cag tta cag aga atg aag gag aat gca aag 1011
Met Lys Cys His Gln Gln Gln Leu Gln Arg Met Lys Glu Asn Ala Lys
240 245 250

cat cgg aac cag tac aaa ttt atc tta ctg gaa aac ctg act tcc cgc 1059
His Arg Asn Gln Tyr Lys Phe Ile Leu Leu Glu Asn Leu Thr Ser Arg
255 260 265

tat gag gtg cct tgt gtc ctt gac ctc aag atg ggc aca cga caa cat 1107
Tyr Glu Val Pro Cys Val Leu Asp Leu Lys Met Gly Thr Arg Gln His
270 275 280

ggt gat gat gct tca gag gag aag gca gcc aac cag atc cga aaa tgt 1155
Gly Asp Asp Ala Ser Glu Glu Lys Ala Ala Asn Gln Ile Arg Lys Cys
285 290 295

cag cag agc aca tct gca gtc att ggt gtg cgt gtg tgt ggc atg cag 1203
Gln Gln Ser Thr Ser Ala Val Ile Gly Val Arg Val Cys Gly Met Gln
300 305 310 315

gtg tac caa gca ggc agt ggg cag ctc atg ttc atg aac aag tac cat 1251
Val Tyr Gln Ala Gly Ser Gly Gln Leu Met Phe Met Asn Lys Tyr His
320 325 330

gga cgg aag cta tcg gtg cag ggc ttc aag gag gca ctt ttc cag ttc 1299
Gly Arg Lys Leu Ser Val Gln Gly Phe Lys Glu Ala Leu Phe Gln Phe
335 340 345

ttc cac aat ggg cgg tac ctg cgc cgt gaa ctc ctg ggc cct gtg ctc 1347
Phe His Asn Gly Arg Tyr Leu Arg Arg Glu Leu Leu Gly Pro Val Leu
350 355 360

aag aag ctg act gag ctc aag gca gtg ttg gag cga cag gag tcc tac 1395
Lys Lys Leu Thr Glu Leu Lys Ala Val Leu Glu Arg Gln Glu Ser Tyr
365 370 375

cgc ttc tac tca agc tcc ctg ctg gtc att tat gat ggc aag gag cgg 1443
Arg Phe Tyr Ser Ser Ser Leu Leu Val Ile Tyr Asp Gly Lys Glu Arg
380 385 390 395

ccc gaa gtg gtc ctg gac tca gat gct gag gat ttg gag gac ctg tca 1491
Pro Glu Val Val Leu Asp Ser Asp Ala Glu Asp Leu Glu Asp Leu Ser
400 405 410

gag gaa tca gct gat gag tct gct ggt gcc tat gcc tac aaa ccc atc 1539
Glu Glu Ser Ala Asp Glu Ser Ala Gly Ala Tyr Ala Tyr Lys Pro Ile
415 420 425

ggc gcc agc tct gta gat gtg cgc atg atc gac ttt gca cac acc acc 1587
Gly Ala Ser Ser Val Asp Val Arg Met Ile Asp Phe Ala His Thr Thr
430 435 440

tgc agg ctg tat ggc gag gac acc gtg gtg cat gag ggc cag gat gct 1635
Cys Arg Leu Tyr Gly Glu Asp Thr Val Val His Glu Gly Gln Asp Ala
445 450 455

ggc tat atc ttc ggg ctc cag agc ctg ata gac att gtc aca gag ata 1683
Gly Tyr Ile Phe Gly Leu Gln Ser Leu Ile Asp Ile Val Thr Glu Ile
460 465 470 475

agt gag gag agt ggg gag tgagcttgct agctgctcca gtacttgaga 1731
Ser Glu Glu Ser Gly Glu
480

gcgactctgt gtcccaggma cagctgtgct gcgtcaggga ggaagccagt atggccaggt 1791
ggtggctcct gcagcctgga gctgatgtgc agtggcctct gtgagcccca gcctgagcca 1851
gtcccagctg tgcttggagt ctttatttat tttaactatt tcttcaacat tccacatttg 1911
atgatgatac ctctttcttc cctgagtgta tatgttctaa tacaaatctt tttgtttatt 1971
ataaaaaaaa aaaaaaaaa 1990

<210> 175
<211> 1971
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 213..1274

<400> 175

ttcagcccca gccagatccc gcgtcaacgg acgcggaacg gcggaccccg taccctggca 60 

gcatcggagc accggcgggt gaaggcaagg tccctggact ggtcatatac ctcttgtggc 120 

cctggcagaa tcaagatgag gccctgtcat gcctccccag tgaggcctac agtctgagca 180 





gacagcatgg cctgccactg gcagtgaaca cc atg tct gca gga ggt ggc cgg 233
Met Ser Ala Gly Gly Gly Arg
1 5

gcc ttt gct tgg caa gtg ttc ccc ccc atg ccc act tgc cgg gtc tat 281
Ala Phe Ala Trp Gln Val Phe Pro Pro Met Pro Thr Cys Arg Val Tyr
10 15 20

ggc aca gtg gca cac caa gat ggg cac ctg ctg gtg ttg ggg ggt tgt 329
Gly Thr Val Ala His Gln Asp Gly His Leu Leu Val Leu Gly Gly Cys
25 30 35

ggc cgg gct gga ctg ccc ctg gac act gct gag aca ctg gac atg gcc 377
Gly Arg Ala Gly Leu Pro Leu Asp Thr Ala Glu Thr Leu Asp Met Ala
40 45 50 55

tcg cac aca tgg ctg gca ctg gca ccc ctg ccc act gcc cgg gct ggt 425
Ser His Thr Trp Leu Ala Leu Ala Pro Leu Pro Thr Ala Arg Ala Gly
60 65 70

gca gct gcg gta gtt ctg ggc aag cag gtg cta gtg gtg ggt ggt gtg 473
Ala Ala Ala Val Val Leu Gly Lys Gln Val Leu Val Val Gly Gly Val
75 80 85

gat gag gtc cag agc ccg gta gct gct gta gag gcc ttc ctg atg gat 521
Asp Glu Val Gln Ser Pro Val Ala Ala Val Glu Ala Phe Leu Met Asp
90 95 100

gag ggc cgc tgg gag cgt cgg gcc acc ctc cct caa gca gcc atg ggg 569
Glu Gly Arg Trp Glu Arg Arg Ala Thr Leu Pro Gln Ala Ala Met Gly
105 110 115

gtt gca act gtg gag aga gat ggt atg gtg tat gct ctg ggg gga atg 617
Val Ala Thr Val Glu Arg Asp Gly Met Val Tyr Ala Leu Gly Gly Met
120 125 130 135

ggc cct gac acg gcc ccc cag gcc cag gta cgt gtg tat gag ccc cgt 665
Gly Pro Asp Thr Ala Pro Gln Ala Gln Val Arg Val Tyr Glu Pro Arg
140 145 150

cgg gac tgc tgg ctt tcg cta ccc tcc atg ccc aca ccc tgc tat ggg 713
Arg Asp Cys Trp Leu Ser Leu Pro Ser Met Pro Thr Pro Cys Tyr Gly
155 160 165

gcc tcc acc ttc ctg cac ggg aac aag atc tat gtc ctg ggg ggc cgc 761
Ala Ser Thr Phe Leu His Gly Asn Lys Ile Tyr Val Leu Gly Gly Arg
170 175 180

cag ggc aag ctc ccg gtg act gct ttt gaa gcc ttt gat ctg gag gcc 809
Gln Gly Lys Leu Pro Val Thr Ala Phe Glu Ala Phe Asp Leu Glu Ala
185 190 195

cgt aca tgg acc cgg cat cca agc cta ccc agc cgt cgg gcc ttt gct 857
Arg Thr Trp Thr Arg His Pro Ser Leu Pro Ser Arg Arg Ala Phe Ala
200 205 210 215

ggc tgc gcc atg gct gaa ggc agc gtc ttt agc ctg ggt ggc ctg cag 905
Gly Cys Ala Met Ala Glu Gly Ser Val Phe Ser Leu Gly Gly Leu Gln
220 225 230

cag cct ggg ccc cac aac ttc tac tct cgc cca cac ttt gtc aac act 953
Gln Pro Gly Pro His Asn Phe Tyr Ser Arg Pro His Phe Val Asn Thr
235 240 245

gtg gag atg ttt gac ctg gag cat ggg tcc tgg acc aaa ttg ccc cgc 1001
Val Glu Met Phe Asp Leu Glu His Gly Ser Trp Thr Lys Leu Pro Arg
250 255 260

agc ctg cgc atg agg gat aag agg gca gac ttt gtg gtt ggg tcc ctt 1049
Ser Leu Arg Met Arg Asp Lys Arg Ala Asp Phe Val Val Gly Ser Leu
265 270 275

ggg ggc cac att gtg gcc att ggg ggc ctt gga aac cag cca tgt cct 1097
Gly Gly His Ile Val Ala Ile Gly Gly Leu Gly Asn Gln Pro Cys Pro
280 285 290 295

ttg ggc tct gtg gag agc ttt agc ctt cgg cgc tgg gag gca 1145
Leu Gly Ser Val Glu Ser Phe Ser Leu Arg Arg Trp Glu Ala
300 305 310

ttg cct gcc atg ccc act gcc cgc tgc tcc tgc tct agt ctg cag gct 1193
Leu Pro Ala Met Pro Thr Ala Arg Cys Ser Cys Ser Ser Leu Gln Ala
315 320 325

ggg ccc cgg ctg ttt gtt att ggg ggt gtg gcc cag ggc ccc agt caa 1241
Gly Pro Arg Leu Phe Val Ile Gly Gly Val Ala Gln Gly Pro Ser Gln
330 335 340

gcc gtg gag gca ctg tgt ctg cgt gat ggg gtc tgaaggcttg gtgggagctg 1294
Ala Val Glu Ala Leu Cys Leu Arg Asp Gly Val
345 350

tccactggag cagctcattg ccagaggcag ctatttctat ggctcctttt gctgctgagg 1354
acactcactg tggctctgtg ggatgagaga ggcatggggg tgagcacttg aaacactgcc 1414
ttggggcctt gggttagggg agcctttgtc tttagtgcag gacacacata tgcttacacc 1474
tacctttatc accattcgtt catgaatcat gcctagctcc atccttgccc tgggacctac 1534
taggccttcc atccaactgg gaaatgggga gaagcaaagc tggcctcatg ctcttcaggg 1594
tcagttccta tctggagttg accaggccta ccccagttgc cattcctgaa aaatctcagc 1654
tgccaggctg cctttagggt ccctgtagac ccaggagagt tgagagggtg ggggacacag 1714
agagaataga gaggatgtgg gaactgccag agggccggag cgcaggagtt caagtggagg 1774
aatgctggct ttgagccctc tacactgctg gttgtatgac cttggacaag tcacttcacc 1834
tctctgtgcc tcagcatcct catctataaa tggggatctc tgaaaccttc ctaccctacc 1894
tacctcacag ggctgttgtg aggacccagg gagtttggat gtggaagtaa aagtgctgcc 1954
aaaaaaaaaa aaaaaaa 1971



<210> 176
<211> 1613
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 68..127



<400> 176 
gacgaaggac tggaaggtgg cggtggtgaa ggtgcaggcc gttggggcgg ctcagaggca 60
ggtgact atg aaa ggc tta tat ttc caa cag agt tcc aca gat gaa gaa 109
Met Lys Gly Leu Tyr Phe Gln Gln Ser Ser Thr Asp Glu Glu
1 5 10

ata aca ttt gta ttt caa taaaaggaag atcttcctgt tacagaggat 157
Ile Thr Phe Val Phe Gln
15 20

aactttgtga aacttcaagt taaagcttgt gctctgagcc agataaatac aaagcttctg 217
gcagaaatga agatgaaaaa ggatttattt cctgttggga gagaaattgc tggaattgta 277
ttagatgttg gaagcaaggt atcattcttt caaccagatg atgaagtagt tggaattttg 337
cccctggact ctgaagaccc tggactttgt gaagttgtta gagtacatga gcattacttg 397
gttcataaac cagaaaaggt cacatggacg gaagcagcag gaagcattcg ggatggagtg 457
cgtgcctata cagctctgca ttatctttct catctctctc ctggaaaatc agtgctgata 517
atggatggag caagtgcatt tggtacaata gctattcagt tagcacatca tagaggagcc 577
aaagtgattt caacagcatg cagccttgaa gataagcagt gccttgaaag attcagacct 637
cccatagccc gagtgattga tgtatctaat gggaaagttc atgttgctga aagctgtttg 697
gaagaaacag gtggcctggg agtagatatt gtcctagatg ctggagtgag attatatagt 757
aaagatgatg aaccagctgt aaaactacaa ctactaccac ataaacatga tatcatcaca 817
cttcttggtg ttggaggcca ctgggtaaca acagaagaaa accttcagtt ggatcctcca 877
gatagccact gccttttcct caagggagca acgttagctt tcctgaatga tgaagtttgg 937
aatttgtcaa atgtacaaca gggaamaata tctttgtatc ttaaaggatg tgatggagaa 997
gttatcaact ggtgttttca gacctcagtt ggatgaaccc attccactgt atgaggcaaa 1057
agtttccatg gaagctgttc agaaaaatca aggaagaaaa aagcaagttg ttcaatttta 1117
attttcttct ttctcagacc tcagtcggat gaacatattc cagtatttga agccagaatt 1177
ttctttggaa attgttgaga aaaaccaagg aagataaaac aagttgcatt tttaagcacg 1237
tttctctgct aagacaagat gctcagttga cacatttgaa aagtgtttga aaaattcttg 1297
tgcaaatgat caagataatt ctataattaa catcttaagg gaatttttct aaaaaccttt 1357
tcattgtttc tatatatttt gcccctgcta taaaattcct tccatgaaga aaactgctgc 1417
tttcagcaaa agtcacacta ctcttgataa aagctgttgc aggcctttgc taagctatca 1477
aagtaacgta ttaattttgt atcaactccg ttctcaacac cttccttaag tctttgctgt 1537
cataatttaa gcatttgagt atattttgaa gtcttaaaag acttagccca taggcactca 1597
aaaaaaaaaa aaaaaa 1613



<210> 177
<211> 1361
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 65..1024

<400> 177 

gaaggactgg aaggtggcgg tggtgaaggt gcaggccgtt ggggcggctc agaggcaggt 60 
gact atg aaa ggc tta tat ttc caa cag agt tec aca gat gaa gaa ata 109 
20 Met Lys Gly Leu Tyr Phe Gin Gin Ser Ser Thr Asp Glu Glu lie 







1 








5 










10 










15 






aca 


ttt 


ata 


ttt 


caa 


aaa 


aag 


gaa 


gat 


ctt 


cct 


gtt 


aca 


gag 


gat 


aac 


157 




Thr 


Phe 


Val 


Phe 


Gin 
20 


Glu 


Lvs 


Glu 


Asp 


Leu 
25 


Pro 


Val 


Thr 


Glu 


Asp 
30 


Asn 




25 


ttt 




aaa 


ctt 


caa 


att 


aaa 


act 


tqt 


get 


ctg 


aqc 


cag 


ata 


aat 


aca 


205 




Phe 


Val 


Lvs 


Leu 
35 


Gin 


Val 


Lvs 


Ala 


Cys 
40 


Ala 


Leu 


Ser 


Gin 


He 
45 


Asn 


Thr 






aag 


ctt 


ctg 


gca 


gaa 


atg 


aag 


atg 


aaa 


aag 


gat 


tta 


ttt 


cct 


gtt 


ggg 


253 




Lvs 


Leu 


Leu 


Ala 


Glu 


Met 


Lys 


Met 


Lys 


Lys 


Asp 


Leu 


Phe 


Pro 


val 


Gly 




30 






50 










55 










60 












aga 


gaa 


att 


get 


gga 


att 


gta 


tta 


gat 


gtt 


gga 


age 


aag 


gta 


tea 


ttc 


301 




Arq 


Glu 
65 


He 


Ala 


Gly 


He 


Val 
70 


Leu 


Asp 


Val 


Gly 


Ser 
75 


Lys 


Val 


Ser 


Phe 






ttt 


caa 


cca 


gat 


gat 


gaa 


gta 


gtt 


gga 


att 


ttg 


ccc 


ctg 


gac 


tct 


gaa 


349 


35 


Phe 
80 


Gin 


Pro 


Asp 


Asp 


Glu 
85 


Val 


Val 


Gly 


He 


Leu 
90 


Pro 


Leu 


Asp 


Ser 


Glu 
95 






gac 


cct 


gga 


ctt 


tgt 


gaa 


gtt 


gtt 


aga 


gta 


cat 


gag 


cat 


tac 


ttg 


gtt 


397 




Asp 


Pro 


Gly 


Leu 


Cys 
100 


Glu 


Val 


Val 


Arg 


Val 
105 


His 


Glu 


His 


Tyr 


Leu 
110 


Val 




40 


cat 


aaa 


cca 


gaa 


aag 


gtc 


aca 


tgg 


acg 


gaa 


gca 


gca 


gga 


age 


att 


egg 


445 




His 


Lys 


Pro 


Glu 
115 


Lys 


Val 


Thr 


Trp 


Thr 
120 


Glu 


Ala 


Ala 


Gly 


Ser 
125 


He 


Arg 






gat 


gga 


gtg 


cgt 


gee 


tat 


aca 


get 


ctg 


cat 


tat 


ctt 


tct 


cat 


etc 


tct 


493 




Asp 


Gly 


Val 


Arg 


Ala 


Tyr 


Thr 


Ala 


Leu 


His 


Tyr 


Leu 


Ser 


His 


Leu 


Ser 




45 






130 










135 










140 












cct 


gga 


aaa 


tea 


gtg 


ctg 


ata 


atg 


gat 


gga 


gca 


agt 


gca 


ttt 


ggt 


aca 


541 




Pro 


Gly 


Lys 


Ser 


Val 


Leu 


He 


Met 


Asp 


Gly Ala 


Ser 


Ala 


Phe 


Gly 


Thr 








145 










150 










155 














ata 


get 


att 


cag 


tta 


gca 


cat 


cat 


aga 


gga 


gee 


aaa 


gtg 


att 


tea 


aca 


589 


50 


He 


Ala 


He 


Gin 


Leu 


Ala 


His 


His 


Arg 


Gly Ala 


Lys 


val 


He 


Ser 


Thr 






160 










165 










170 










175 






gca 


tgc 


age 


ctt 


gaa 


gat 


aag 


cag 


tgc 


ctt 


gaa 


aga 


ttc 


aga 


cct 


ccc 


637 




Ala 


Cys 


Ser 


Leu 


Glu 
180 


Asp 


Lys 


Gin 


Cys 


Leu 
185 


Glu 


Arg 


Phe 


Arg 


Pro 
190 


Pro 




55 


ata 


gee 


cga 


gtg 


att 


gat 


gta 


tct 


aat 


ggg 


aaa 


gtt 


cat 


gtt 


get 


gaa 


685 




He 


Ala 


Arg 


Val 
195 


He 


Asp 


Val 


Ser 


Asn 
200 


Gly 


Lys 


Val 


His 


Val 
205 


Ala 


Glu 






agc tgt ttg gaa gaa aca ggt ggc ctg gga gta gat att gtc cta gat 733
Ser Cys Leu Glu Glu Thr Gly Gly Leu Gly Val Asp Ile Val Leu Asp
210 215 220












gct gga gtg aga tta tat agt aaa gat gat gaa cca gct gta aaa cta 781
Ala Gly Val Arg Leu Tyr Ser Lys Asp Asp Glu Pro Ala Val Lys Leu
225 230 235



caa cta cta cca cat aaa cat gat atc atc aca ctt ctt ggt gtt gga 829
Gln Leu Leu Pro His Lys His Asp Ile Ile Thr Leu Leu Gly Val Gly
240 245 250 255

ggc cac tgg gta aca aca gaa gaa aac ctt cag ttg gat cct cca gat 877
Gly His Trp Val Thr Thr Glu Glu Asn Leu Gln Leu Asp Pro Pro Asp
260 265 270

agc cac tgc ctt ttc ctc aag gga gca acg tta gct ttc ctg aat gat 925
Ser His Cys Leu Phe Leu Lys Gly Ala Thr Leu Ala Phe Leu Asn Asp
275 280 285

gaa gtt tgg aat ttg tca aat gta caa cag gga aaa tat ctt tat ctt 973
Glu Val Trp Asn Leu Ser Asn Val Gln Gln Gly Lys Tyr Leu Tyr Leu
290 295 300

aaa gga tgt gat gga gaa gtt atc aac tgg tgt ttt cag acc tca gtc 1021
Lys Gly Cys Asp Gly Glu Val Ile Asn Trp Cys Phe Gln Thr Ser Val
305 310 315

gga tgaacatatt ccagtatttg aagccagaat tttctttgga aattgttgag 1074
Gly
320

aaaaaccaag gaagataaaa caagttgcat ttttaagcac gtttctctgc taagacaaga 1134
tgctcagttg acacatttga aaagtgtttg aaaaattctg gcttctaatc ctgcctctgt 1194
tcccttttct ctccttgaaa gtccagcaca ccattcttgt ccttccccag tttcctcgcc 1254
ctccacccct ccagcttcat gctcagtgtt gtgcttaata aaatggacat atttttctct 1314
aaaaaaaaaa aaaaaakaaa aaaaaaaaat aaaaaaaaaa aaaaaaa 1361



<210> 178
<211> 1113
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 109..585

<400> 178

gcgggaccgg acttccggct ggtctgtggg gtttcgggtt cggggtttcc tggtgggcgt 60
caggggcagg caacagagtg gcggccgcta cggccctgga acggggcc atg gag aag 117
Met Glu Lys
1

ctg cgg cga gtc ctg agc ggc cag gac gac gag gag cag ggc ctg act 165
Leu Arg Arg Val Leu Ser Gly Gln Asp Asp Glu Glu Gln Gly Leu Thr
5 10 15

gcg cag gtc ctg gat gcc tca tcc ctt agt ttc aac acc aga ttg aaa 213
Ala Gln Val Leu Asp Ala Ser Ser Leu Ser Phe Asn Thr Arg Leu Lys
20 25 30 35

tgg ttt gcc atc tgc ttc gta tgt ggc gtt ttc ttt tct att ctt gga 261
Trp Phe Ala Ile Cys Phe Val Cys Gly Val Phe Phe Ser Ile Leu Gly
40 45 50

act gga ttg ctg tgg ctt ccg ggc ggc ata aag ctt ttt gca gtg ttt 309
Thr Gly Leu Leu Trp Leu Pro Gly Gly Ile Lys Leu Phe Ala Val Phe
55 60 65

tat acc ctc ggc aat ctt gct gcg tta gcc agt aca tgc ttt tta atg 357
Tyr Thr Leu Gly Asn Leu Ala Ala Leu Ala Ser Thr Cys Phe Leu Met
70 75 80

gga cct gtg aag caa ctg aag aaa atg ttt gaa gca aca aga ttg ctt 405
Gly Pro Val Lys Gln Leu Lys Lys Met Phe Glu Ala Thr Arg Leu Leu
85 90 95

gca aca att gtt atg ctt ttg tgt ttc ata ttt acc ctg tgt gct gct 453
Ala Thr Ile Val Met Leu Leu Cys Phe Ile Phe Thr Leu Cys Ala Ala
100 105 110 115

ctt tgg tgg cat aag aag gga ctg gct gtg tta ttc tgc ata ttg cag 501
Leu Trp Trp His Lys Lys Gly Leu Ala Val Leu Phe Cys Ile Leu Gln
120 125 130

ttc ttg tca atg acc tgg tat agc ctg tcg tac atc cca tat gca agg 549
Phe Leu Ser Met Thr Trp Tyr Ser Leu Ser Tyr Ile Pro Tyr Ala Arg
135 140 145

gat gca gtt att aaa tgc tgt tct tct ctc cta agt tgaaaatcag 595
Asp Ala Val Ile Lys Cys Cys Ser Ser Leu Leu Ser
150 155

aaacattgtg gaaaagagca cttgaatgta tggtactcta tgtttggtga agtttgcttt 655
tccccataaa acactccagg aacaactgac gtgacagttg aagaccgttt tgtactaagt 715
ctcattttgt atactggtaa aaactacatg cttgattaaa ccattaaatg cttgtaactt 775
taaattcatt atgtgtcatt aatatacttt tccaaagata agatttttaa tcactgccag 835
ttgtaaatta tttttagcca atttttaaat cttttcaaag cagctttgaa atgtgaatat 895
ttaaaggtag acctcgtgct gcaagataat taaacttttt tgcttttaaa aaatgtctgc 955
atttttaaga ttttttttac tttaaatgtg aaacttattt taagctagaa amattgctta 1015
ttatatgtaa taaaaataat atataaatct ttacaatktt tgaaataaac ccatccttgg 1075
aaaaataaaa aaaaaaaaaa agaaaaaaaa aaaaaaaa 1113

<210> 179
<211> 1960
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 29..577



<400> 179

atcggccaac ggacgcgagg cgcgcgcc atg gaa cag cgg tta gct gag ttt 52
Met Glu Gln Arg Leu Ala Glu Phe
1 5

cgg gcg gcg cgg aaa cgg gcg ggt ctg gcg gcc caa ccc cct gct gcc 100
Arg Ala Ala Arg Lys Arg Ala Gly Leu Ala Ala Gln Pro Pro Ala Ala
10 15 20

agt cag ggc gca caa acc cca gga gag aag gcg gaa gca gca gcg act 148
Ser Gln Gly Ala Gln Thr Pro Gly Glu Lys Ala Glu Ala Ala Ala Thr
25 30 35 40

cta aag gca gcc cca ggc tgg cta aag cgg ttc ctg gta tgg aaa cct 196
Leu Lys Ala Ala Pro Gly Trp Leu Lys Arg Phe Leu Val Trp Lys Pro
45 50 55

agg ccc gcg agt gcc cgg gcc cag ccc ggc cta gtt cag gaa gcg gct 244
Arg Pro Ala Ser Ala Arg Ala Gln Pro Gly Leu Val Gln Glu Ala Ala
60 65 70

cag ccc cag ggc agc aca tca gag aca cca tgg aac aca gcc att cct 292
Gln Pro Gln Gly Ser Thr Ser Glu Thr Pro Trp Asn Thr Ala Ile Pro
75 80 85

ctg ccg tcg tgc tgg gac cag tct ttc ctg acc aat atc acc ttc ttg 340
Leu Pro Ser Cys Trp Asp Gln Ser Phe Leu Thr Asn Ile Thr Phe Leu
90 95 100

aag gtt ctt ctc tgg ttg gtc ctg ctg gga ctg ttt gtg gaa ctg gaa 388
Lys Val Leu Leu Trp Leu Val Leu Leu Gly Leu Phe Val Glu Leu Glu
105 110 115 120

ttt ggc ctg gca tat ttt gtc ctg tcc ttg ttc tat tgg atg tac gtc 436
Phe Gly Leu Ala Tyr Phe Val Leu Ser Leu Phe Tyr Trp Met Tyr Val
125 130 135

ggg aca cga ggc cct gaa gag aag aaa gag gga gag aag agc gcc tac 484
Gly Thr Arg Gly Pro Glu Glu Lys Lys Glu Gly Glu Lys Ser Ala Tyr
140 145 150

tct gtg ttc aat cca ggc tgt gaa gcc atc cag ggc acc ctg act gca 532
Ser Val Phe Asn Pro Gly Cys Glu Ala Ile Gln Gly Thr Leu Thr Ala
155 160 165

gag cag ttg gag cgc gag tta cag ttg aga ccc ctg gca ggg aga 577
Glu Gln Leu Glu Arg Glu Leu Gln Leu Arg Pro Leu Ala Gly Arg
170 175 180

taggacccag ctgtgctgtc atgcagctaa cctctgatgt ggtcttcctc accattggct 637
atggatttga tttcaggtgt ataggactaa gggcagcttg cgggttagct ctgtgactgc 697



atagtttttc taccttcttt ccctgatctt ttgctgccat ttgatctttg atagttttgg 757
tgaaactctc taaaatacat tcactgtggg tccgacgcaa tttataaaaa ttatgtactc 817
aagaagggag acctgtttgt ttcatttctc atctgtttgg gagatgattt tagagcacta 877
gaaaggcact ggggagattc tcagcttaaa acatccagca gtttgaagta tgattaggta 937
catcagggct gcattgtcaa tgttctcttt aagtctttta acatttatag caattttttt 997
tttcccggag agtttaggtt gcaagttttg ggtttcttgt ttgtttttgt tttgcttcct 1057
gctttaattc tttaattttc agtcattact ggtattgaaa aataaaatat ctttaaaaca 1117
tctcctcttc agaaataggt ccctcttcat tgcccatcac catcttccac tctcctatta 1177
ttttgccact actcagtaaa ggaaggtagg aagagacaaa cgcctaagtg caggtgtggg 1237
gagggatttc acaagtggtt attaacggcc agttcagcaa gaagtgttga gtgtgtacaa 1297
aggggagggc tggaagtgtt aactccagac ccgttggctg cttgagttgt ttcttatatt 1357
ctaaagcagc agtccctaac ctttttggca ccagggacca gttttgtgga acacagtttt 1417
tccatggacg gggtggtggt ggaggatgaa acttccacct cagatcatca ggcattagag 1477
tctcataagg agcacgcaac ctagatccct cgcatgcgca gttcacaata cggttctaag 1537
ggctttagag taagcagctt tttcacctgt gggcctctgg tgagaaattc tgtaaattgt 1597
gataatcagg ctggatttta atgctgcttt tccagtacaa tgttagagtt tgggttcatt 1657
aaaattaggc aaactcccat tgggttaggg cttctctcat tccattttgt ggctaacctt 1717
actgtgtttc agcccttgct gaaaattctt ctgatatgtg ttgcccttcc tcacagccct 1777
ttggccattg ggagtttggc tgtccctcag agccatccgg tcaagcagat ggtctgttct 1837
atctcacaga aaagtctttt cttccatgag ttctgtctga actgaacatg taaaaagtat 1897
gggaaacaga tgaatcccta ttaaacatga agttttgatt gtattcaaaa aaaaaaaaaa 1957
aaa 1960



<210> 180
<211> 1443
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 23..451

<400> 180

accggcgggc ggggcgggta ag atg gcg gcc ccg cgg cga ggg aga gga tcc 52
Met Ala Ala Pro Arg Arg Gly Arg Gly Ser
1 5 10

tcc aca gtg ctc tct tca gtt ccc ctt caa atg ctg ttt tat ctc agc 100
Ser Thr Val Leu Ser Ser Val Pro Leu Gln Met Leu Phe Tyr Leu Ser
15 20 25

gga acg tac tac gcc ctg tat ttc ctc gcc acg ctc ctg atg atc acg 148
Gly Thr Tyr Tyr Ala Leu Tyr Phe Leu Ala Thr Leu Leu Met Ile Thr
30 35 40

tat aaa agt cag gtg ttc agc tat cct cac cgc tac ctg gtc ctc gat 196
Tyr Lys Ser Gln Val Phe Ser Tyr Pro His Arg Tyr Leu Val Leu Asp
45 50 55

ctt gct ctg ctg ttt ctg atg ggg att cta gaa gca gtt cgg tta tac 244
Leu Ala Leu Leu Phe Leu Met Gly Ile Leu Glu Ala Val Arg Leu Tyr
60 65 70

ctg ggc acc agg ggc aac ctg aca gag gct gag agg ccg ctg gcc gcc 292
Leu Gly Thr Arg Gly Asn Leu Thr Glu Ala Glu Arg Pro Leu Ala Ala
75 80 85 90

agc ctg gcc ctc acg gct ggc acc gcc ctc ctc tct gcc cac ttc ctg 340
Ser Leu Ala Leu Thr Ala Gly Thr Ala Leu Leu Ser Ala His Phe Leu
95 100 105

ctt tgg cag gcc cta gtg ttg tgg gcg gac tgg gcc ctc agc gcc acg 388
Leu Trp Gln Ala Leu Val Leu Trp Ala Asp Trp Ala Leu Ser Ala Thr
110 115 120

ctc ctg gcc ctt cac ggc ctg gag gcc gtc ctg cag gtg gtt gcc atc 436
Leu Leu Ala Leu His Gly Leu Glu Ala Val Leu Gln Val Val Ala Ile
125 130 135

gcg gcc ttc acc agg tagctacgga cacccgggat accccacact ggggccctcc 491
Ala Ala Phe Thr Arg
140



tcctgggcct gaccagtccc ccagctgtca cctccccatt cctggacagg aagggcactt 551
ttcctagtga ctggccatag atggttttgg atggttccat ctgttctggc aggagtggga 611
gcaggagcca gggcagaaca aactgctgga ggccctggtg ttgggaacag ctgcggggag 671
ggtagggacc agacagaact gccttcaaga tgagtcccag gagcgcacac tcagccctgt 731
cagtggggtc tggctttagc agccaggcct ccacagaccc ccatgggccc ccagggccga 791
gagggaggac agagcccttc agaacagagg cctcatctca ctgcatcccc catcaccccc 851
tagttcccca atggtcctaa tttgtgttct gagatcccag tttactctgt ggccaggccc 911
cacctgtgtt tccaagtcgg gctggagacg caggatgggg taggccttgt gctctgagca 971
accccagctc tgcctcacag gcaggcaggc ccggtgcaag agtggactct gggttcctaa 1031
agcaataaat gcaaacaagc caacagctct gctgcctagc aatttccatc ttagccacac 1091
ttctcccttc aggggcttcg gaggagaggt cagggctaag gccggggatg atactgcagg 1151
agagagagca gcggagggcc acattcggag cctccgtcca ctccagtttt atcagctttt 1211
gcttttgcac ggagtgctaa acaaattcta gctctgtgtt tttttcccat tcccagattt 1271
actatcagtt ctccttaaaa agtatctaag ctgttacagt agctttccct tcacttgatt 1331
ctattgtgtg ttttctatgt ttggaataat tacacccaaa tatctagata ttttctcttc 1391
accgcatttt gtaaataaag agatgtgtat gcctcaaaaa aaaaaaaaaa aa 1443

<210> 181
<211> 605
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 232..450



<400> 181

caaatacaaa tgccccaaga agactgagga taggagaaag aatatctcta cctgtgaaac 60
attgttagac tgcctggcta ggagttcatt gttgttttct gaaggacgta accaaccact 120
ccaaaactta caggcttaaa acaacaaaca tgtatcattt cttatgattc tgtgggttgg 180
ctgggtggtt cttctagctg aggcaggatg gtctaggata gctacatcca c atg tct 237
Met Ser
1

ggg gtc cca gct gag atg act ggg gct gtt gag gcc ttt ctc cct gtg 285
Gly Val Pro Ala Glu Met Thr Gly Ala Val Glu Ala Phe Leu Pro Val
5 10 15

gtg tca tcc tcc aga agg ctg ccc aga ttt gtc cat atg gta gca gga 333
Val Ser Ser Ser Arg Arg Leu Pro Arg Phe Val His Met Val Ala Gly
20 25 30

gtt tcc tcg aag caa gag agg gca aga tcc aac aca gaa gca ctt ttc 381
Val Ser Ser Lys Gln Glu Arg Ala Arg Ser Asn Thr Glu Ala Leu Phe
35 40 45 50

aag ctc tgt ttc cat cac att tgc caa tgt ctc act gat gaa cac aag 429
Lys Leu Cys Phe His His Ile Cys Gln Cys Leu Thr Asp Glu His Lys
55 60 65

ttc cat ggc caa gtc cag ttt taagaaatgg agaaataggg cttggctcag 480
Phe His Gly Gln Val Gln Phe
70

tggctcatgt ctgtaatccc agcactttgg gaggccaagg catgcggatc atttgaggtc 540
aggagttcca gaccagcctg gccaacatgg tgaaaaccca tctctaccaa aaaaaaaaaa 600
aaaaa 605



<210> 182
<211> 1724
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 758..1183

<400> 182

aactaaagcc gggagtccgg tgaacgggca gaagcagggc catgcccaag ccacccccaa 60
gatccccctg aacctgcacc tccatcacga cccattcagg agcctccagg agcccagaca 120
ccagcccccc accatggtaa gtccttcaag ggtgggatct ggaagaggaa agaggaggga 180
caccagccag gtggaggtgt cctaaaaatg accatcagaa attggggtga ggggaggggc 240
atggtggact tctgtggggg tggggtgtct ctcagtgcag ctcaggtgcc tccagcatcc 300
cttaccaggg agcaagctcc catctgtagg tggtggggat gccagggtgg tatccctgga 360
tccaaggata gggcaggacc tggaagacag aaggtggccc agggagaatc acagagtctg 420
cagggacaag gacatagcct cctttgcttg caaattaagg gagccctttc ccagtccagc 480
ccagtctctc gtctccctgt gtagccttgg gctagtcact tcccctctct tggccccggt 540
tcccacagat gtcatatttg gaaatccgtc tagatgcgga agttgctctt caggggtctt 600
tcagttgcaa cattctcaag gtctgtgggt tctgccacag agtcctcggc tgagatggga 660
agctatgtct aacaagcgat ggggtggatt gacgccctcc ctgtgccggt gacgggcggt 720
atggctgcag cagaggcagg agaggctgaa tacgtcc atg cca ccc ttt ggt ggg 775
Met Pro Pro Phe Gly Gly
1 5

cat ccc tta tcc caa gag gag gat ggc agc cag agg tgt tgc tgc ctg 823
His Pro Leu Ser Gln Glu Glu Asp Gly Ser Gln Arg Cys Cys Cys Leu
10 15 20

tca agt ctg agg tct gtc gat gat agc aac ggg gag act gtc gtg atc 871
Ser Ser Leu Arg Ser Val Asp Asp Ser Asn Gly Glu Thr Val Val Ile
25 30 35

atg gcg cta ttc cta gca gta tcg tac cac cat aag acg caa agt aag 919
Met Ala Leu Phe Leu Ala Val Ser Tyr His His Lys Thr Gln Ser Lys
40 45 50

agg tgg cca ggg ctg acc cca ccc cac agc tct ctg ctg tgt aga cca 967
Arg Trp Pro Gly Leu Thr Pro Pro His Ser Ser Leu Leu Cys Arg Pro
55 60 65 70

ctt cag ctt tca ttt ctc gtc att cag tca gtg agg atg aga gca tgt 1015
Leu Gln Leu Ser Phe Leu Val Ile Gln Ser Val Arg Met Arg Ala Cys
75 80 85

ggc tgt gac agc ggc cac tgc agg att ctt ggc agg tac agc tta cta 1063
Gly Cys Asp Ser Gly His Cys Arg Ile Leu Gly Arg Tyr Ser Leu Leu
90 95 100

ggg tgg agt cag gga cat agg gca aga ggc aga ggt ggt gtt agt ctg 1111
Gly Trp Ser Gln Gly His Arg Ala Arg Gly Arg Gly Gly Val Ser Leu
105 110 115

aga gac aac acc ttc ttt cag gaa gcc agt gag ggc cag gga cag tgg 1159
Arg Asp Asn Thr Phe Phe Gln Glu Ala Ser Glu Gly Gln Gly Gln Trp
120 125 130

ctc atg cct gta atc cca gca ttt taggaggctg agacaggtag atcacttgag 1213
Leu Met Pro Val Ile Pro Ala Phe
135 140

gtcaggtgtt cgagaccagc ctggccaacg tggtgaaacc tcgtctctac taaaaaatac 1273
aaaaaattaa ctgggcgtgg tggcacacgc ctgtaatccc agctacatat gaggctgagg 1333
caagagaata acttgaaccc aggaggcgga gggtgcagtg agctgagatc ctgccgctgc 1393
actccagcct gggtgacaga gcacactccg tctcaaaaaa ggaaagctga tgagaaattg 1453
ggcatcccgg aattcacacc caaaccatca gctggagctc tgagactgtt ggggtgggaa 1513
ttcttccaag atgagaagca agccagggag gctcaggtcc tgggatgggc agggctttga 1573
tcaaaagaac acaggaagtg atttgctact tgaaagaaag gcaacccctc cccaaggaag 1633
ccctctgaaa atgcttagtc aacagtcggc ttggcagaca aggtctggga ggggccaccc 1693
gtatcgcaga ggacaaaaaa aaaaaaaaaa a 1724



<210> 183
<211> 1686
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 486..932

<400> 183

cggctcactg cagcctcttc ctcccagtct caagtgattc tcctgtctca gcctcctgag 60
tggctgggat tacaggtgtg caccactacc acttggctaa tttttatact tttagtagag 120
atggggtttc accatgttgg ccaggctggc cttgaactcc tgacctcagg tgatccgccc 180
gcctcggcct cccaaagtgc tggggttaca ggcatgagcc accgcacccg gcccccttcc 240
ttcgtcttag tcaatcctat cccacctctt cttccaccag tcccctcacc tgatggtccc 300
aacacttcat catccaccac ctcctggagg gggtaccccg aggtgctccg ctggggactc 360
tgctcattct gggggtgcag ttgacggctg gtcgtgatct ttcccgtaat ctgtcccctc 420
ttacggaacc tagtctccgt tctgtccatg gccttcttct ggacactgct aggatccaga 480
agagt atg tta tca att ctc aag cct agg aga agt cag gag tgg aga aca 530
Met Leu Ser Ile Leu Lys Pro Arg Arg Ser Gln Glu Trp Arg Thr
1 5 10 15

gct ctg aga aga tac tgt tgt cca act gat ctc cag gca cca cgg agt 578
Ala Leu Arg Arg Tyr Cys Cys Pro Thr Asp Leu Gln Ala Pro Arg Ser
20 25 30

ccg gtc cct cca atc agg aag gtc gga atc tct gat gtc atc gtt cat 626
Pro Val Pro Pro Ile Arg Lys Val Gly Ile Ser Asp Val Ile Val His
35 40 45

gcc aac ctg gca acc agt ttg aaa aaa aac aca tgt aac tgc cag gct 674
Ala Asn Leu Ala Thr Ser Leu Lys Lys Asn Thr Cys Asn Cys Gln Ala
50 55 60

gat ctc ttg tcc tgg aga tcc tgg gtg aat ggt atc tcc tgc cac tgt 722
Asp Leu Leu Ser Trp Arg Ser Trp Val Asn Gly Ile Ser Cys His Cys
65 70 75

ccc aac ctc aga cca ttg tcc aaa agc atc ttc agg gac tcc aca tcc 770
Pro Asn Leu Arg Pro Leu Ser Lys Ser Ile Phe Arg Asp Ser Thr Ser
80 85 90 95

ctc tgt tcc ctg tcc cag cag agg ctg tgt cct ctc cac tca aag cct 818
Leu Cys Ser Leu Ser Gln Gln Arg Leu Cys Pro Leu His Ser Lys Pro
100 105 110

gaa gca tgt tgg ggt ctc ttt gtc tct gta cat gcc cat ttc aga gtc 866
Glu Ala Cys Trp Gly Leu Phe Val Ser Val His Ala His Phe Arg Val
115 120 125

cag gct ggt ggg aga ggg aac aga gtg gga aag aaa act agg gta agc 914
Gln Ala Gly Gly Arg Gly Asn Arg Val Gly Lys Lys Thr Arg Val Ser
130 135 140

aga aac gat gaa acc tta taagagtgag attatcatgt gcaagagtga 962
Arg Asn Asp Glu Thr Leu
145

gattatcatg tacaagagat cccaggaaat actgactttg atgaaaaagt cacatcagag 1022
cactcagttt tggcagagct ttttctgccg aatgtttact cacattcact gtccgagatt 1082
ctatactggg ggtacacacg tcctctgccc taaggcaatt ttgagtccaa gagacatttt 1142
gaggcctaaa aatcatagga aactgcccct gagctcacac atatttccaa tggtgtcccc 1202
aatttcaggg aatccatgga ttacctaagc cagcccctcc agttcggcta agaaactcta 1262
gtctatatgt caagttttgt atcatatgta ttgctctgaa ctcagaaatt tcccttccat 1322
ttatggattc tatgaataaa atatcacatg tacaaaaaga ctaagtcaaa aaatttcagc 1382
tgtgcacagt ggctcatgct tgtaatccca gcactttggg tggccgaggg gggaggattg 1442
cctgaggcca gcagttcaag accagtatgg gcaacatggc aagagcccat ctctaaaaaa 1502
acaaaaccaa accaaattgg ccaggtgtgg tggctggcac ctgtgttcca actacttggg 1562
agactcatgt gacaggaaga tcacttgagc ccgggggtta gaggctgcag tgagctatga 1622
tcttgccact gcactccagc ctgggtgaca gagcgagaca ccgtcgcaaa aaaaaaaaaa 1682
aaaa 1686

<210> 184
<211> 463
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 80..304

<400> 184

cttttaacag ctgaggtctc tctttaattc tcttaaatac catttctccc tcaaaaaaga 60
ccattagatc atttcacaa atg tat ctg cca cca aac agg tca gag ctt tgc 112
Met Tyr Leu Pro Pro Asn Arg Ser Glu Leu Cys
1 5 10

aac ttt gct ttg tct ctt aac ctc tat ggc aaa ggg ttt ttt agc ctg 160
Asn Phe Ala Leu Ser Leu Asn Leu Tyr Gly Lys Gly Phe Phe Ser Leu
15 20 25

gtg gaa aag cat aac agc agg gat tta gaa gat aga gct agt tct ggc 208
Val Glu Lys His Asn Ser Arg Asp Leu Glu Asp Arg Ala Ser Ser Gly
30 35 40

cca tca ctt tca tct cca tca cac ccg gac tgg ggt tat ata gtt ctg 256
Pro Ser Leu Ser Ser Pro Ser His Pro Asp Trp Gly Tyr Ile Val Leu
45 50 55

att tta gtg gca acc ctg ggg gaa ctt gat acc cag gta ggt ggt cac 304
Ile Leu Val Ala Thr Leu Gly Glu Leu Asp Thr Gln Val Gly Gly His
60 65 70 75

tgatcagtag ttgggagagg taggaattgg tgagtacagg taattagagg aaagtcttgt 364
gtcctgtttc ccccctttta attttatccc ttgctagaat taagatacta tatgcctcac 424
ttatcaatta cagtctaaat ccaaaagaaa aaaaaaaaa 463



<210> 185
<211> 773
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 188..691

<400> 185

agttgcgggt tgcaggagtt caggaaagga ggtgggacta gagtcaacct ggaatagctc 60
tacagtaaca atggcagcct ttttgttgct gggacatcca tacaggcaac ttagctggtg 120
aaaggactct ggattggttg gcagtctgct tttttttttc caaggtgatc actttactgt 180
agaagaa atg agg tta aca gaa aag agt gag gga gaa caa caa ctc aag 229
Met Arg Leu Thr Glu Lys Ser Glu Gly Glu Gln Gln Leu Lys
1 5 10

ccc aac aac tct aat gca ccc aat gaa gat caa gaa gaa gaa atc caa 277
Pro Asn Asn Ser Asn Ala Pro Asn Glu Asp Gln Glu Glu Glu Ile Gln
15 20 25 30

cag tca gaa cag cat act cca gca agg cag cga aca caa aga gca gac 325
Gln Ser Glu Gln His Thr Pro Ala Arg Gln Arg Thr Gln Arg Ala Asp
35 40 45

aca cag cca tcc aga tgt cga ttg cct tca cgt agg aca cct aca aca 373
Thr Gln Pro Ser Arg Cys Arg Leu Pro Ser Arg Arg Thr Pro Thr Thr
50 55 60

tcc agc gac aga acg atc aac ctt ctt gaa gtc ctt ccg tgg cct act 421
Ser Ser Asp Arg Thr Ile Asn Leu Leu Glu Val Leu Pro Trp Pro Thr
65 70 75

gag tgg att ttc aac ccc tat cga ttg cct gct ctt ttt gag ctt tat 469
Glu Trp Ile Phe Asn Pro Tyr Arg Leu Pro Ala Leu Phe Glu Leu Tyr
80 85 90

cct gaa ttt ctt ctg gtg ttt aaa gaa gcc ttc cat gac ata tcc cat 517
Pro Glu Phe Leu Leu Val Phe Lys Glu Ala Phe His Asp Ile Ser His
95 100 105 110

tgt ctg aaa gcc cag atg gaa aag atc gga ctg ccc atc ata ctc cac 565
Cys Leu Lys Ala Gln Met Glu Lys Ile Gly Leu Pro Ile Ile Leu His
115 120 125

ctc ttc gca ctc tcc acc ctc tac ttc tac aag ttt ttc ctt cct aca 613
Leu Phe Ala Leu Ser Thr Leu Tyr Phe Tyr Lys Phe Phe Leu Pro Thr
130 135 140

att ctt tcc ctt tct ttc ttt att ctt ctt gta ctt ctg ctt ctg ctt 661
Ile Leu Ser Leu Ser Phe Phe Ile Leu Leu Val Leu Leu Leu Leu Leu
145 150 155

ttt att att gtc ttc att ctg atc ttc ttc tgattctttt gtttcaataa 711
Phe Ile Ile Val Phe Ile Leu Ile Phe Phe
160 165

acagcaatga gcatgaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 771
aa 773

<210> 186
<211> 753
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 94..573

<400> 186

acttttcagg ggacattcag aggcatcagc cccttcctcc tcaccagctc ccagagttcc 60
catctccatc cccaatccta aagaaggaaa tcg atg cca cgg tcc tca agg agc 114
Met Pro Arg Ser Ser Arg Ser
1 5

cct ggg gac cca ggc gcc cta ctc gaa gat gtg gcc cac aat ccc aga 162
Pro Gly Asp Pro Gly Ala Leu Leu Glu Asp Val Ala His Asn Pro Arg
10 15 20

ccc cgg agg att gcc cag cga ggc cgg aac acc agc agg atg gca gag 210
Pro Arg Arg Ile Ala Gln Arg Gly Arg Asn Thr Ser Arg Met Ala Glu
25 30 35

gac acc tcc cca aac atg aat gac aac atc ctg ttg cct gtc cgc aac 258
Asp Thr Ser Pro Asn Met Asn Asp Asn Ile Leu Leu Pro Val Arg Asn
40 45 50 55

aat gac caa gcc cta ggc ctg act cag tgc atg ctg gga tgt gtg tcc 306
Asn Asp Gln Ala Leu Gly Leu Thr Gln Cys Met Leu Gly Cys Val Ser
60 65 70

tgg ttc acc tgt ttt gcc tgc tcc ctg aga act cag gcc cag cag gtt 354
Trp Phe Thr Cys Phe Ala Cys Ser Leu Arg Thr Gln Ala Gln Gln Val
75 80 85

ctg ttt aac acg tgc aga gac aga gtt tca cca tgt tgc cca ggc tgg 402
Leu Phe Asn Thr Cys Arg Asp Arg Val Ser Pro Cys Cys Pro Gly Trp
90 95 100

tct caa act cca gtg atc ctc cca cct cag cct tcc gaa gtg ctg gga 450
Ser Gln Thr Pro Val Ile Leu Pro Pro Gln Pro Ser Glu Val Leu Gly
105 110 115

tta cag atg caa gct gct gtg cca gaa gct cat gga gaa gac agg cat 498
Leu Gln Met Gln Ala Ala Val Pro Glu Ala His Gly Glu Asp Arg His
120 125 130 135

tct gct cct ctg tgc ttt cgg tgt gtc cca ggg ccc tgc cca gtc cca 546
Ser Ala Pro Leu Cys Phe Arg Cys Val Pro Gly Pro Cys Pro Val Pro
140 145 150

ggt gga ggt atc cct ggg ccc tgg cac tgattatagg acactgggca 593
Gly Gly Gly Ile Pro Gly Pro Trp His
155 160

agacactgca ctgccacgtg actcagtttc cccatctgcc tgatgggtgt tgctgtgaga 653
attatgaaat gaaatgatga ccatgaaaat attgtagaag ccaagaaatg cttcagaagt 713
tataaagctc tccccaaacc gtgttaaaaa aaaaaaaaaa 753



<210> 187
<211> 754
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 181..462

<400> 187

atcctatcaa aagttacggt gaagtcaggg tgggtggcga gtccctgcaa ggtcgcccct 60
ctgtgccaac acagcctgat ggcttcttgt ttcaggaaac atccagaatt acaactggcc 120
attgagttat tacatatcaa ttgaacaagg tagttttaaa atgaaagaaa atcttgcaac 180
atg aat aaa gag ata gac tct ttg aat ctg gca tac agc ttt ccc ttc 228
Met Asn Lys Glu Ile Asp Ser Leu Asn Leu Ala Tyr Ser Phe Pro Phe
1 5 10 15

ctt ctt cct gct ttc ctg gac aca ccg tgg aca gac cca ttt ccc tct 276
Leu Leu Pro Ala Phe Leu Asp Thr Pro Trp Thr Asp Pro Phe Pro Ser
20 25 30

gga ttc atg gta agg tcc cga gtg ctt ctg ata cag ctg ctg agc aga 324
Gly Phe Met Val Arg Ser Arg Val Leu Leu Ile Gln Leu Leu Ser Arg
35 40 45

ccc cgc tca tct cag gag tcc cga gga cac tcg ctt ccc tgc agc ccg 372
Pro Arg Ser Ser Gln Glu Ser Arg Gly His Ser Leu Pro Cys Ser Pro
50 55 60

tcc gcc ctc cat aag cct ggg ggc atc tgc cct gca gca ctg ggg agg 420
Ser Ala Leu His Lys Pro Gly Gly Ile Cys Pro Ala Ala Leu Gly Arg
65 70 75 80

agc cac ctc ctt gtc tgg gaa cag cca agc ctc cgt gac agc 462
Ser His Leu Leu Val Trp Glu Gln Pro Ser Leu Arg Asp Ser
85 90

tgaggattct tgtggattgt tctttctgta actggacagc acatccggaa ttccttgcca 522
tagctctgtg ccttgctggg gtctgaggtt cacaggtcag atgctgctgt ctggtccttc 582
ccaattgcgg cgtgaattcc ttcatcctca ccagtagctt cttgctctcc ccaagggagg 642
cacgtgctta gtagggagag aggcctacca aggttgccat ctgccatggg ctcaattgtg 702
tccccaaccc ccctgcaaat tatatattga agtccccaaa aaaaaaaaaa aa 754

<210> 188
<211> 998
<212> DNA
<213> Homo sapiens



<220> 
<221> CDS 
<222> 



290 



<220>
<221> misc_feature
<222> 871
<223> n=a, g, c or t

<400> 188

gattc atg aag gcc tcg ggt cct gac ctc tct gat gga ctc cac tgc ccc 50
Met Lys Ala Ser Gly Pro Asp Leu Ser Asp Gly Leu His Cys Pro
1 5 10 15

agt cta att aga cat tta aga acc ttc tct gca gct gct gcc tta gcc 98
Ser Leu Ile Arg His Leu Arg Thr Phe Ser Ala Ala Ala Ala Leu Ala
20 25 30

cca aga tac cca acc aga ctt ccc agt tca ctg ctt cta tgg cac ctc 146
Pro Arg Tyr Pro Thr Arg Leu Pro Ser Ser Leu Leu Leu Trp His Leu
35 40 45

tgc cag tgc ctc cat ctc ctc tat gca gtt tct acc tca tgc aac agc 194
Cys Gln Cys Leu His Leu Leu Tyr Ala Val Ser Thr Ser Cys Asn Ser
50 55 60

cat ggg aag aga tcg gct gcc tgg gca atg acc aga aca gaa gac aca 242
His Gly Lys Arg Ser Ala Ala Trp Ala Met Thr Arg Thr Glu Asp Thr
65 70 75

gat gcg cta aca gat tcc ttc gat gac agt ttc atc agt tct gca gat 290
Asp Ala Leu Thr Asp Ser Phe Asp Asp Ser Phe Ile Ser Ser Ala Asp
80 85 90 95

taaagacttt caccagaaaa aaaaattacc tgattttgcc ctgaggcagc cagggagggc 350
tttgtccttg acaatcccac tgacttattt aacaggtagc tcaaaaccca acaaaaactg 410
gaggaggctg ctccactgca gggatggttt caattcggta actggagtat tgtactctcc 470
ttgcaccctg gctcatcccc acaaaagacc tttcaaagaa aacacttaat tacctccttg 530
cacaagccct gtaagcccta aggtgaaaag aaactcagca gacaaggtcc acagagaagg 590
agaaggcaca attcagtagg gacctacgct cagcaccagg ataaagaaac tgtccattcc 650
tgccacctcc taggaagcta aaagaattaa ggggaggccg ggcacggtgg ctcacgcctg 710
taatcccagc actttgggag gccgaggcgg gtggatcatg aggtcaggag atcgagacca 770
tcctggctaa catggtgaaa ccccatctct actaaaaata caaaaaatta gccgggcgtg 830
gtggcgggcg ccctgtagtc ccagctactc gggaggctga nggcaggaga atggtgtgaa 890
cctgggaggc ggagcttgca gtgagccgag attgcgccct gctccactcc agcctgagcg 950
acagagcgag actccgtctc aaaaaaaaaa argaaaaaaa aaaaaaaa 998



<210> 189
<211> 605
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 115..411

<400> 189

aagaaagggg tgaggcctaa gggacaatca ggatgttttt cagagagaag tgtggatgct 60
ggacaggaag aaccacagat accagatacg ggtactgttg taactctgtt ctcc atg 117
Met
1

aaa aaa aag gaa gaa aca aca ctt tca gag atg gag cct gtt gag cca 165
Lys Lys Lys Glu Glu Thr Thr Leu Ser Glu Met Glu Pro Val Glu Pro
5 10 15

cag tac caa cta gtc aat gct gaa tcg act tct ccc ttt cta cat tgc 213
Gln Tyr Gln Leu Val Asn Ala Glu Ser Thr Ser Pro Phe Leu His Cys
20 25 30

ctg aga gaa gtc att ggg gaa tac tct gta cac gaa ttt tca ctg ttg 261
Leu Arg Glu Val Ile Gly Glu Tyr Ser Val His Glu Phe Ser Leu Leu
35 40 45

ggg aaa aca gag agt caa ggg att gga ttg tgg att gca ttg gtg gtt 309
Gly Lys Thr Glu Ser Gln Gly Ile Gly Leu Trp Ile Ala Leu Val Val
50 55 60 65

ttc ctc agt ttc ctc atc ttc tcc aca agt ttc tac ata tcg aat gca 357
Phe Leu Ser Phe Leu Ile Phe Ser Thr Ser Phe Tyr Ile Ser Asn Ala
70 75 80

gag cag ccc ttc ttc aaa gaa cct cct acg gaa gct gct aag gaa ctc 405
Glu Gln Pro Phe Phe Lys Glu Pro Pro Thr Glu Ala Ala Lys Glu Leu
85 90 95

agt ctg tagctctgcg tggagccatg tgtaaacact gaactgagac ctgccacctc 461
Ser Leu

ctactaccta agggcccatt ttcatctgat atcatccccc agaaacaaac tcatgatgac 521
ttccatgttt tttttagatt agatacatgg agaattttcc tttcccttag aattaaaatc 581
ctgcattcta aaaaaaaaaa aaaa 605



<210> 190
<211> 526
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 3..368

<400> 190

ag atc cga gcg acc atg gtg gcc cgg gtg tgg tcg ctg atg agg ttc 47
Ile Arg Ala Thr Met Val Ala Arg Val Trp Ser Leu Met Arg Phe
1 5 10 15

ctc atc aag gga agt gtg gct ggg ggc gcc gtc tac ctg gtg tac gac 95
Leu Ile Lys Gly Ser Val Ala Gly Gly Ala Val Tyr Leu Val Tyr Asp
20 25 30

cag gag ctg ctg ggg ccc agc gac aag agc cag gca gcc cta cag aag 143
Gln Glu Leu Leu Gly Pro Ser Asp Lys Ser Gln Ala Ala Leu Gln Lys
35 40 45

gct ggg gag gtg gtc ccc ccc gcc atg tac cag ttc agc cag tac gtg 191
Ala Gly Glu Val Val Pro Pro Ala Met Tyr Gln Phe Ser Gln Tyr Val
50 55 60

tgt cag cag aca ggc ctg cag ata ccc cag ctc cca gcc cct cca aag 239
Cys Gln Gln Thr Gly Leu Gln Ile Pro Gln Leu Pro Ala Pro Pro Lys
65 70 75

att tac ttt ccc atc cgt gac tcc tgg aat gca ggc atc atg acg gtg 287
Ile Tyr Phe Pro Ile Arg Asp Ser Trp Asn Ala Gly Ile Met Thr Val
80 85 90 95

atg tca gct ctg tcg gtg gcc ccc tcc aag gcc cgc gag tac tcc aag 335
Met Ser Ala Leu Ser Val Ala Pro Ser Lys Ala Arg Glu Tyr Ser Lys
100 105 110

gag ggc tgg gag tat gtg aag gcg cgc acc aag tagcgagtca gcaggggccg 388
Glu Gly Trp Glu Tyr Val Lys Ala Arg Thr Lys
115 120

cctgccccgg ccagaacggg cagggctgcc actgacctga agactccgga ctgggacccc 448
actccgaggg cagctcccgg ccttgccggc ccaataaagg acttcagaag tgaaaaaaaa 508
ataaaaaaaa aaaaaaaa 526



<210> 191
<211> 910
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 174..527

<400> 191

attttcctgt taggccaaga gagaagagga tccttcctca gagcctccag cctcccttga 60
tcccttgctt gtgggcatat gtgggtcata tttccctccc atcaccctct gcacgccacc 120
cccatcaccg ccacagaccc ccagcccttc agttgccctg cacctccttg gtg atg 176
Met
1

cag ccg tcc ttg tta agg tca tac agg ttg aag gcc caa tta agc ctg 224
Gln Pro Ser Leu Leu Arg Ser Tyr Arg Leu Lys Ala Gln Leu Ser Leu
5 10 15

tca tct aca gtt ccc cga aga atc acg gac aaa cca gcc aca aag tcc 272
Ser Ser Thr Val Pro Arg Arg Ile Thr Asp Lys Pro Ala Thr Lys Ser
20 25 30

tgg gaa gga ggc agg agg gag ctg tgt cct cgg gta ctc ttc acc caa 320
Trp Glu Gly Gly Arg Arg Glu Leu Cys Pro Arg Val Leu Phe Thr Gln
35 40 45

ctc ctt ctc tgg gtt tgg cct gga gat cct ggc cct gaa ctc cag gaa 368
Leu Leu Leu Trp Val Trp Pro Gly Asp Pro Gly Pro Glu Leu Gln Glu
50 55 60 65

aca ggc ttc cct ggc cca cct cgc cca gct cac ctc aaa act gac cga 416
Thr Gly Phe Pro Gly Pro Pro Arg Pro Ala His Leu Lys Thr Asp Arg
70 75 80

gcc atc atg gtt ggt gtc aaa ggc att gaa gag aaa agt ggc ata ggt 464
Ala Ile Met Val Gly Val Lys Gly Ile Glu Glu Lys Ser Gly Ile Gly
85 90 95

gct gga gtc tgc agg gtg agt gtg gag aag ttg gct tcc aca cag gag 512
Ala Gly Val Cys Arg Val Ser Val Glu Lys Leu Ala Ser Thr Gln Glu
100 105 110

agg act tcc tcc ctc taaggagctc cccatacccc ccatcacctt ggcattccca 567
Arg Thr Ser Ser Leu
115

gctcctccag aatccctccc tccctcagcc tagagaagga caactgcttc cccttgggcc 627
ttgtcccctc acctccttga ggaaagaact gggagtaaat ctgcttgaag ttctcctcat 687
tgacaattcc gctgggacat tcctggaagg agagggcacc aggctgaggg cagagacaaa 747
atccccttcc gttcaccgcc cccaccctcc atggcccaag actcccaggg agggggataa 807
tcttcaagcc tccagaggac tcaccacgtg gctcatgtga tgggagggaa gacttctttc 867
ccagtgcaca aataaaaaac atggaacgaa aaaaaaaaaa aaa 910

<210> 192
<211> 668
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 57..203

<400> 192

tcctgtcgac gtgttcttcc ggtggcggag cggcggatta gccttcgcgg ggcaaa atg 59
Met
1

gag ctc gag gcc atg agc aga tat acc agc cca gtg aac cca cct gtc 107
Glu Leu Glu Ala Met Ser Arg Tyr Thr Ser Pro Val Asn Pro Pro Val
5 10 15

ttc ccc cat ctg acc gtg gtg ctt ttg gcc att ggc atg ttc ttc acc 155
Phe Pro His Leu Thr Val Val Leu Leu Ala Ile Gly Met Phe Phe Thr
20 25 30

gcc tgg ttc ttc gtg tat cct ttc act gag cag cca gag gac cag cat 203
Ala Trp Phe Phe Val Tyr Pro Phe Thr Glu Gln Pro Glu Asp Gln His
35 40 45

tagtgatgtg ggaagctcag ggagaaacca cgctaggtac atggaccccg ccggttttgt 263
acattggatt ggggctgaga gaagattgcc gtgggctggg ctctctgcac tccacagtcc 323
accccttcgc tttgccttaa ctgctgtgcc cagttacgag gtcacctcta ccaagtacac 383
tcgtgatatc tataaagagc tcctcatctc attagtggcc tcactcttca tgggctttgg 443
agtcctcttc ctgctgctct gggttggcat ctacgtgtga gcacccaagg gtaacaacca 503
gatggcttca ctgaaacctg cttttgtaaa ttactttttt ttactgttgc tggaagtgtc 563
ccacctgctg ctcataataa atgcagatgt atagcaaaaa aaaaaaaaaa aaaaaaaaaa 623
aaaaaaaaaa aaaaaaaaaa aaaaaaaaat aaaaaaaaaa aaaaa 668

<210> 193
<211> 637
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 68..334

<400> 193

agttatgaag ttctaaaagc aagtcttaat caggaagtgt ccttgatcac caacggctcg 60
cccaggc atg ctg gct ctc ttc cac ttc cac ctt cca cca tgg gat gac 109
Met Leu Ala Leu Phe His Phe His Leu Pro Pro Trp Asp Asp
1 5 10

gca gta aga agg cca tca gta gat gcc agt ccc tca acc ttg aac ttt 157
Ala Val Arg Arg Pro Ser Val Asp Ala Ser Pro Ser Thr Leu Asn Phe
15 20 25 30

cca gac gca gaa ctt tat gcc tcc att ttc ctc tgc tgc atg gcc cca 205
Pro Asp Ala Glu Leu Tyr Ala Ser Ile Phe Leu Cys Cys Met Ala Pro
35 40 45

gga gag att tta att agc ttt cta acc ttg gtc cag att gca cat gca 253
Gly Glu Ile Leu Ile Ser Phe Leu Thr Leu Val Gln Ile Ala His Ala
50 55 60

aat ggt aga gga tgc aac acc ccc gct tgt gga gct gcc gct tgt gtc 301
Asn Gly Arg Gly Cys Asn Thr Pro Ala Cys Gly Ala Ala Ala Cys Val
65 70 75

tgg cat gaa aat tca caa gaa gag agg aaa tac tgaggagaaa atggcagatt 354
Trp His Glu Asn Ser Gln Glu Glu Arg Lys Tyr
80 85

gtgtttgctg aatttgattg acgaagaagt caccatgaaa atcacagtga accatttgga 414
aagcaaactg ccaaaaaaat aatagttagt catgctctca ggctggttgt tttggctgtt 474
gtgggtttct tgcatttcca gatgattgca aagagctgtt tctcaatttc tgcaacaagt 534
gccagctgaa attttggtac cagtttcatt aaatatgtat aacaaaakaa aaaaaaaaaa 594
aaaaaaaaaa aaaaaaaaaa aaaaaaagaa aaaaaaaaaa aaa 637

<210> 194
<211> 706
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 183..443

<400> 194

agaagttctc agagggtgag ggtcccacat ctcctgcagg acaggcccta gctaccgagt 60
cacagaaacc cagggccgaa gcaaagtccc aatcccagag aggctggggc acacctacaa 120
ctgaaaggag gcttagaaat ccttcagaga ccaccctatc ggttctcctc cacctggaca 180
gg atg agc cag caa cac aga agg aag agg cct tcc tcc gaa aga aaa 227
Met Ser Gln Gln His Arg Arg Lys Arg Pro Ser Ser Glu Arg Lys
1 5 10 15

agc aca aga aag atg gac aca tgg cag agt ctt aaa gtc aaa gaa gta 275
Ser Thr Arg Lys Met Asp Thr Trp Gln Ser Leu Lys Val Lys Glu Val
20 25 30

ttc tgt aag cat aat tct tcc tat gaa tgc ctt ctc tat aaa gag gtt 323
Phe Cys Lys His Asn Ser Ser Tyr Glu Cys Leu Leu Tyr Lys Glu Val
35 40 45

gaa gca aga cag gtt tct aag aca gcc acc gat ggg tcc tac ctc ctc 371
Glu Ala Arg Gln Val Ser Lys Thr Ala Thr Asp Gly Ser Tyr Leu Leu
50 55 60

gta ttc aca tcc tat gta atc tcc tcc cca gtg tgg act gga cct ggt 419
Val Phe Thr Ser Tyr Val Ile Ser Ser Pro Val Trp Thr Gly Pro Gly
65 70 75

gac ttg ctt cca gtg aat aga ata tagcaaaagt gattgatgtc acctccaaga 473
Asp Leu Leu Pro Val Asn Arg Ile
80 85

ttcagctata gaagactatg actatgactt tcctcttggc tagcattctc gctaaccctt 533
cctgcttgct tgtactgagc tgccctatga agaggcccat gtagggtggc ctgggtgggg 593
gtgatctgtg gccaacagcc agcaaggaac taaatcctgt ttacaaccac atgagcttgg 653
aaggagatcc ttccccagta aagccaggag atgaatacaa aaaaaaaaaa aaa 706



<210> 195
<211> 670
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 94..228

<400> 195

acttttcagg ggacattcag aggcatcagc cccttcctcc tcaccagctc ccagagttcc 60
catctccatc cccaatccta aagaaggaaa tcg atg cca cgg tcc tca agg agc 114
Met Pro Arg Ser Ser Arg Ser
1 5

cct ggg gac cca ggc gcc cta ctc gaa gat ggc cca caa tcc cag acc 162
Pro Gly Asp Pro Gly Ala Leu Leu Glu Asp Gly Pro Gln Ser Gln Thr
10 15 20

ccg gag gat tgc cca gcg agg ccg gaa cac cag cag gat ggc aga gga 210
Pro Glu Asp Cys Pro Ala Arg Pro Glu His Gln Gln Asp Gly Arg Gly
25 30 35

cac ctc ccc aaa cat gaa tgacaacatc ctgttgcctg tccgcaacaa 258
His Leu Pro Lys His Glu
40 45

tgaccaagcc ctaggcctga ctcagtgcat gctgggatgt gtgtcctggt tcacctgttt 318
tgcctgctcc ctgagaactc aggcccagca ggttctgttt aacacgtgca gatgcaagct 378
gctgtgccag aagctcatgg agaagacagg cattctgctc ctctgtgctt tcggtgtgtc 438
ccagggccct gcccagtccc aggtggaagg tatccctggg ccctggcact gattatagga 498
cactgggcaa gacactgcac cgccacgtga ctcagtttcc ccatctgcct gatgggtgtt 558
gctgtgagaa ttatgaaatg aaatgatgac catgaaaata ttgtagaagc caagaaatgc 618
ttcagaagtt ataaagctct ccccaaaccg tgttatgaaa aaaaaaaaaa aa 670



<210> 196
<211> 510
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 133..327

<400> 196

aacctcaagg agccctgttg tgctaccgac tgcagagctc atggacatcc atcaggaagc 60
ctccaatacc caaaccaggg gtagttgcct aatccatatc catgtggata gctctttact 120
taggaaacct tg atg gct tat ttg gat gac aaa ggt tcc ctt ttg gcg ata 171
Met Ala Tyr Leu Asp Asp Lys Gly Ser Leu Leu Ala Ile
1 5 10

cat agc cat gcg aga caa cat agc cat gaa aca aac caa gtc cac cag 219
His Ser His Ala Arg Gln His Ser His Glu Thr Asn Gln Val His Gln
15 20 25

tgg ctt cct agg aac aca ttt gct ttc ctg ata aaa gag gac aga tgc 267
Trp Leu Pro Arg Asn Thr Phe Ala Phe Leu Ile Lys Glu Asp Arg Cys
30 35 40 45

agt tgc aga agt acc tgt gcc tct ttt tct ttt tct tct tct ttt tct 315
Ser Cys Arg Ser Thr Cys Ala Ser Phe Ser Phe Ser Ser Ser Phe Ser
50 55 60

ttt tta atc tct taaatgcaga tataagaact ggtactgaag cagccatctt 367
Phe Leu Ile Ser
65

gtgaccataa ggaagaagcc aagaacatca gaaccagtgg cctagccatt gcacagtcat 427
ctaaacacac ctctggactt gttattatgt aaaaaaaaat aaacacctgc tcttgttatt 487
tgcaatccaa aaaaaaaaaa aaa 510



<210> 197
<211> 500
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 22..357

<400> 197

atagaatata cacaaaagga a atg aga aag aaa tgt aaa tgc ttc act ata 51
Met Arg Lys Lys Cys Lys Cys Phe Thr Ile
1 5 10

aaa aaa aca aat aca tac gaa gaa agt aat gca gga aat gaa gga caa 99
Lys Lys Thr Asn Thr Tyr Glu Glu Ser Asn Ala Gly Asn Glu Gly Gln
15 20 25

aaa gaa gct ata agc att tgt att tgc aga aga gat ggt tta ctt cct 147
Lys Glu Ala Ile Ser Ile Cys Ile Cys Arg Arg Asp Gly Leu Leu Pro
30 35 40

ctg tgg gta acc agg tta tca gat ttg gtg ttt tcc aaa gaa aag gca 195
Leu Trp Val Thr Arg Leu Ser Asp Leu Val Phe Ser Lys Glu Lys Ala
45 50 55

cat ggc atg att cca ctt ctt ggc tcc cat agg gaa aag aag aca agt 243
His Gly Met Ile Pro Leu Leu Gly Ser His Arg Glu Lys Lys Thr Ser
60 65 70

aaa gag atg aag act tct tcc agg aac ctg agg tac ttc att gtc tgc 291
Lys Glu Met Lys Thr Ser Ser Arg Asn Leu Arg Tyr Phe Ile Val Cys
75 80 85 90

aga gat gcc tca tcc tac acc cct cag tca ctc ata tct gga tac att 339
Arg Asp Ala Ser Ser Tyr Thr Pro Gln Ser Leu Ile Ser Gly Tyr Ile
95 100 105

gga cct tgt caa cat caa taatggacat acctctgata tttgaactct 387
Gly Pro Cys Gln His Gln
110

gaatctcact ctgtgaccac aactttgtat ctttctaagt ctttaatctt caacctcaca 447
gaactcttca taccctaaaa tatagtattt tcacctggaa aaaaaaaaaa aaa 500



<210> 198
<211> 667
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 4..333

<400> 198

aaa atg gtg ttt gga gcc atg gtc ctt ctt gtg gga ctt gaa gaa ctg 48
Met Val Phe Gly Ala Met Val Leu Leu Val Gly Leu Glu Glu Leu
1 5 10 15

acc aat atc cgc aac gtg gag aga ctg aag aag gac ttg agg gcc agt 96
Thr Asn Ile Arg Asn Val Glu Arg Leu Lys Lys Asp Leu Arg Ala Ser
20 25 30

tat tgc ctc atc gac agc ttc ctg ggg gac tcg gag ctc atc ggg gac 144
Tyr Cys Leu Ile Asp Ser Phe Leu Gly Asp Ser Glu Leu Ile Gly Asp
35 40 45

ctg acc cag tgt gtg gac tgc gtg att cct cca gag ggg tcc ctc ttg 192
Leu Thr Gln Cys Val Asp Cys Val Ile Pro Pro Glu Gly Ser Leu Leu
50 55 60

cag atc tct agc tac ctc tac tta aat act gct ctt gtg gac ttg cct 240
Gln Ile Ser Ser Tyr Leu Tyr Leu Asn Thr Ala Leu Val Asp Leu Pro
65 70 75

ggt gtg gcg gcc tcc cag gca tgt gac tct cag cag gtg act tgg ctt 288
Gly Val Ala Ala Ser Gln Ala Cys Asp Ser Gln Gln Val Thr Trp Leu
80 85 90 95

ctc tac gtt gct aat ggt gcc tac tcg gca tgt aac agg cct gga 333
Leu Tyr Val Ala Asn Gly Ala Tyr Ser Ala Cys Asn Arg Pro Gly
100 105 110

tgaacggtag ctgctgcggt tacattatta gcttcagttt gcccgcccag gctagatgtt 393
taatcagatt tcacagactt cacagtgtga gttggggatg tgacttcgta tgaaagtgaa 453
ggaactcagg ctcagagagg gtgagacgta ggagcatggc cactgcgcga gctcggggct 513
ggctgtgggt ttctccccat tccctgccca tctgggaagt cgctgccacc ccctacgctt 573
gtctgctgac tcccagtccc cctaaccccc cagaatgtaa acagcagcag atgaacaaaa 633
ataaaaatac aaaaggccga aaaaaaaaaa aaaa 667

<210> 199
<211> 514
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 1..363




WO 01/42451 



PCT/IB00/01938 



<400> 199

acg agt tct tcc ggg gcg gag gtc acc atg gca gct gcc ttg gct cgg 48
Thr Ser Ser Ser Gly Ala Glu Val Thr Met Ala Ala Ala Leu Ala Arg
1 5 10 15

ctt ggt ctg cgg cct gtc aaa cag gtt cgg gtt cag ttc tgt ccc ttc 96
Leu Gly Leu Arg Pro Val Lys Gln Val Arg Val Gln Phe Cys Pro Phe
20 25 30

gag aaa aac gtg gaa tcg acg agg acc ttc ctg cag acg gtg agc agt 144
Glu Lys Asn Val Glu Ser Thr Arg Thr Phe Leu Gln Thr Val Ser Ser
35 40 45

gag aag gtc cgc tcc act aat ctc aac tgc tca gtg att gcg gac gtg 192
Glu Lys Val Arg Ser Thr Asn Leu Asn Cys Ser Val Ile Ala Asp Val
50 55 60

agg cat gac ggc tcc gag ccc tgc gtg gac gtg ctg ttc gga gac ggg 240
Arg His Asp Gly Ser Glu Pro Cys Val Asp Val Leu Phe Gly Asp Gly
65 70 75 80

cat cgc ctg att atg cgc ggc gct cat ctc acc gct ctg gaa atg ctc 288
His Arg Leu Ile Met Arg Gly Ala His Leu Thr Ala Leu Glu Met Leu
85 90 95

acc gcc ttc gcc tcc cac atc cgg gcc agg gac gcg gcg ggc agc ggg 336
Thr Ala Phe Ala Ser His Ile Arg Ala Arg Asp Ala Ala Gly Ser Gly
100 105 110

gac aag ccg ggc gct gat act ggt cgc tgacagcgcc aaagagacca 383
Asp Lys Pro Gly Ala Asp Thr Gly Arg
115 120

acaagatgat ttgcgtggac taggacactt aacctaagaa gagtttcact taatcattca 443
aatcactatc tgaagggtca cggagcgcaa aataaagttt aaaaccctgc taccaaaaaa 503
aaaaaaaaaa a 514

<210> 200
<211> 462
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 41..337

<400> 200

cttcaccacc aaaactctcc actccaccag cacagccaaa atg ctc gca cgt gct 55
Met Leu Ala Arg Ala
1 5

act ttc cgc gcc gcc tcg gcc cca act ctc gtc gcc cgc cgc ggc ttc 103
Thr Phe Arg Ala Ala Ser Ala Pro Thr Leu Val Ala Arg Arg Gly Phe
10 15 20

cag tcg acc cgc gcg caa atg gcc agc cca tac cac tac ccc gag ggt 151
Gln Ser Thr Arg Ala Gln Met Ala Ser Pro Tyr His Tyr Pro Glu Gly
25 30 35

cct cgc agc aac ttg cca ttc gac ccg ctg aag aag ggc ttt gct ttc 199
Pro Arg Ser Asn Leu Pro Phe Asp Pro Leu Lys Lys Gly Phe Ala Phe
40 45 50

aag tac tgg ggc ttt atg ggc acc gga ttc gcc ctt ccc ttc ctc ctt 247
Lys Tyr Trp Gly Phe Met Gly Thr Gly Phe Ala Leu Pro Phe Leu Leu
55 60 65

gct gtc tgg caa aca gaa caa gcc gta aat gcg ctg aga cac ggc gtg 295
Ala Val Trp Gln Thr Glu Gln Ala Val Asn Ala Leu Arg His Gly Val
70 75 80 85

gac atg cgt atc ggg atc ccg ggg aac acg gca ttt gta gat 337
Asp Met Arg Ile Gly Ile Pro Gly Asn Thr Ala Phe Val Asp
90 95

taggtggagg gcccgcatac ggctatacta gacatcacag catcaatttc attgtctgtc 397
cccaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 457
aaaaa 462
<210> 201
<211> 551
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 1..549

<400> 201

aga gag gga gcc cga gcc agg cca tct cca acc atg tcc gac gag gcc 48
Arg Glu Gly Ala Arg Ala Arg Pro Ser Pro Thr Met Ser Asp Glu Ala
1 5 10 15

tcg gcc atc act tcc tac gag aag ttt cta acc ccc gag gag ccc ttc 96
Ser Ala Ile Thr Ser Tyr Glu Lys Phe Leu Thr Pro Glu Glu Pro Phe
20 25 30

cca ctc ctg gga cct cct cgc ggg gtg ggc acc tgc ccg agc gag gag 144
Pro Leu Leu Gly Pro Pro Arg Gly Val Gly Thr Cys Pro Ser Glu Glu
35 40 45

ccg ggc tgc ctg gac atc agc gac ttc ggc tgc cag ctg tcc tcc tgc 192
Pro Gly Cys Leu Asp Ile Ser Asp Phe Gly Cys Gln Leu Ser Ser Cys
50 55 60

cat cgc acc gac ccg ctc cac cgc ttc cac acc aac agg tgg aac cta 240
His Arg Thr Asp Pro Leu His Arg Phe His Thr Asn Arg Trp Asn Leu
65 70 75 80

act tct tgt gga aca agt gtt gcc agc tca gaa ggc agt gag gag ctg 288
Thr Ser Cys Gly Thr Ser Val Ala Ser Ser Glu Gly Ser Glu Glu Leu
85 90 95

ttt tca tct gtg tct gtt gga gat caa gat gat tgc tat tcc ctg tta 336
Phe Ser Ser Val Ser Val Gly Asp Gln Asp Asp Cys Tyr Ser Leu Leu
100 105 110

gat gat cag gac ttc act tct ttt gat tta ttt cct gag ggg agt gtc 384
Asp Asp Gln Asp Phe Thr Ser Phe Asp Leu Phe Pro Glu Gly Ser Val
115 120 125

tgc agt gat gtc tct tct tct att agc act tac tgg gat tgg tca gat 432
Cys Ser Asp Val Ser Ser Ser Ile Ser Thr Tyr Trp Asp Trp Ser Asp
130 135 140

agc gag ttt gaa tgg cag tta cca ggc agt gac att gcc agt ggg agt 480
Ser Glu Phe Glu Trp Gln Leu Pro Gly Ser Asp Ile Ala Ser Gly Ser
145 150 155 160

gat gta ctt tct gat gtc ata ccc agt att cca agt tca cct tgc ctg 528
Asp Val Leu Ser Asp Val Ile Pro Ser Ile Pro Ser Ser Pro Cys Leu
165 170 175

ctt cct aaa aaa aaa aaa aaa aa 551
Leu Pro Lys Lys Lys Lys Lys
180

<210> 202
<211> 550
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 34..315

<220>
<221> misc_feature
<222> 483
<223> n=a, g, c or t

<400> 202

agagagggag cccgagccag gccatctcca acc atg tcc gac gag gcc tcg gcc 54
Met Ser Asp Glu Ala Ser Ala
1 5

atc act tcc tac gag aag ttt cta acc ccc gag gag ccc ttc cca ctc 102
Ile Thr Ser Tyr Glu Lys Phe Leu Thr Pro Glu Glu Pro Phe Pro Leu
10 15 20

ctg gga cct cct cgc ggg gtg ggc acc tgc ccg agc gag gag ccg ggc 150
Leu Gly Pro Pro Arg Gly Val Gly Thr Cys Pro Ser Glu Glu Pro Gly
25 30 35

tgc ctg gac atc agc gac ttc ggc tgc cag ctg tcc tcc tgc cat cgc 198
Cys Leu Asp Ile Ser Asp Phe Gly Cys Gln Leu Ser Ser Cys His Arg
40 45 50 55

acc gac ccg ctc cac cgc ttc cac acc aac agg tgg aac cta act tct 246
Thr Asp Pro Leu His Arg Phe His Thr Asn Arg Trp Asn Leu Thr Ser
60 65 70

tgt gga aca agt gtt gcc agc tca gaa ggc agt gag gag ctg ttt tca 294
Cys Gly Thr Ser Val Ala Ser Ser Glu Gly Ser Glu Glu Leu Phe Ser
75 80 85

tct gtc tgt tgg aga tca aga tgattgctat tccctgttag atgatcagga 345
Ser Val Cys Trp Arg Ser Arg
90

cttcacttct tttgatttat ttcctgaggg gagtgtctgc agtgatgtct cttcttctat 405
tagcacttac tgggattggt cagatagcga gtttgaatgg cagttaccag gcagtgacat 465
tgccagtggg agtgatgnta ctttctgatg tcatacccag tattccaagt tcaccttgcc 525
tgcttcctaa aaaaaaaaaa aaaaa 550

<210> 203
<211> 408
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 1..315

<400> 203

atc ggg ccg cga gcg ccc tcc ccg tcg ttt tcc gtg aga gac gta gag 48
Ile Gly Pro Arg Ala Pro Ser Pro Ser Phe Ser Val Arg Asp Val Glu
1 5 10 15

ctg agc gac cca gcc cgc gag cga ggt gag atg ccg gtg gcc gtg ggt 96
Leu Ser Asp Pro Ala Arg Glu Arg Gly Glu Met Pro Val Ala Val Gly
20 25 30

ccc tac gga cag tcc cag cca agc tgc ttc gac cgt gtc aaa atg ggc 144
Pro Tyr Gly Gln Ser Gln Pro Ser Cys Phe Asp Arg Val Lys Met Gly
35 40 45

ttc gtg atg ggt tgc gcc gtg ggc atg gcg gcc ggg gcg ctc ttc ggc 192
Phe Val Met Gly Cys Ala Val Gly Met Ala Ala Gly Ala Leu Phe Gly
50 55 60

acc ttt tcc tgt ctc agg atc gga atg cgg ggt cga gag ctg atg ggc 240
Thr Phe Ser Cys Leu Arg Ile Gly Met Arg Gly Arg Glu Leu Met Gly
65 70 75 80

ggc att ggg aaa acc atg atg cag agt ggc ggc acc ttt ggc aca ttc 288
Gly Ile Gly Lys Thr Met Met Gln Ser Gly Gly Thr Phe Gly Thr Phe
85 90 95

atg gcc att ggg atg ggc atc cga tgc taaccatggt tgccaactac 335
Met Ala Ile Gly Met Gly Ile Arg Cys
100 105

atctgtccct tcccatcaat cccagcccat gtactaataa aagaaagtct ttgagcaaaa 395
aaaaaaaaaa aaa 408



<210> 204
<211> 665
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 94..582

<400> 204

acttttcagg ggacattcag aggcatcagc cccttcctcc tcaccagctc ccagagttcc 60
catctccatc cccaatccta aagaaggaaa tcg atg cca cgg tcc tca agg agc 114
Met Pro Arg Ser Ser Arg Ser
1 5

cct ggg gac cca ggc gcc cta ctc gaa gat gtg gcc cac aat ccc aga 162
Pro Gly Asp Pro Gly Ala Leu Leu Glu Asp Val Ala His Asn Pro Arg
10 15 20

ccc cgg agg att gcc cag cga ggc cgg aac acc agc agg atg gca gag 210
Pro Arg Arg Ile Ala Gln Arg Gly Arg Asn Thr Ser Arg Met Ala Glu
25 30 35

gac acc tcc cca aac atg aat gac aac atc ctg ttg cct gtc cgc aac 258
Asp Thr Ser Pro Asn Met Asn Asp Asn Ile Leu Leu Pro Val Arg Asn
40 45 50 55

aat gac caa gcc cta ggc ctg act cag tgc atg ctg gga tgt gtg tcc 306
Asn Asp Gln Ala Leu Gly Leu Thr Gln Cys Met Leu Gly Cys Val Ser
60 65 70

tgg ttc acc tgt ttt gcc tgc tcc ctg aga act cag gcc cag cag gtt 354
Trp Phe Thr Cys Phe Ala Cys Ser Leu Arg Thr Gln Ala Gln Gln Val
75 80 85

ctg ttt aac acg tgc aga tgc aag ctg ctg tgc cag aag ctc atg gag 402
Leu Phe Asn Thr Cys Arg Cys Lys Leu Leu Cys Gln Lys Leu Met Glu
90 95 100

aag aca ggc att ctg ctc ctc tgt gct ttc ggt gtg tcc cag ggc cct 450
Lys Thr Gly Ile Leu Leu Leu Cys Ala Phe Gly Val Ser Gln Gly Pro
105 110 115

gcc cag tcc cag gtg gag gta tcc ctg ggc cct ggc act gat tat agg 498
Ala Gln Ser Gln Val Glu Val Ser Leu Gly Pro Gly Thr Asp Tyr Arg
120 125 130 135

aca ctg ggc aag aca ctg cac tgc cac gtg act cag ttt ccc cat ctg 546
Thr Leu Gly Lys Thr Leu His Cys His Val Thr Gln Phe Pro His Leu
140 145 150

cct gat ggg tgt tgc tgt gag aat tat gaa atg aaa tgatgaccat 592
Pro Asp Gly Cys Cys Cys Glu Asn Tyr Glu Met Lys
155 160

gaaaatattg tagaagccaa gaaatgcttc agaagttata aagctctccc caaaccgcaa 652
aaaaaaaaaa aaa 665

<210> 205
<211> 1008
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 540..923

<400> 205

atttaggtga gctgccacgt ccggaggagg gcagcaagaa tgaaagacct ctagttttcc 60
agactcccgg agccctggtc tctacaccac atggacgtta tccacctcct ctgtgtcctc 120
ccaaggcagc atttcagaag gtgatccacg gcaaagccgt cccttcaaat ccgtctttgt 180
gcccactgcc atagtcaacc ccgtgagaag cacagccggc cctgggactt taggacaagg 240
gtctcttcgg aaagggcgga gcagcatgag aaagagtaag tggtggcaga gagatggatc 300
cctgcagaga cccctccagt ccgggatccc cactctcgtg gtaggctccc tcagacgcag 360
ccccaccatg gtccttcggc ctcagcagtt ccaattctac cagccacagg ggatcacctc 420
ctccccctca gccgtggtgg tggagatggg gtccaagcct gccctcacgg gggagcccgc 480
cctcacgtgc atcagcaggg gcagtgaggc ggatccactc cgcggccagc tccctcatt 539
atg gaa gac aaa gaa atc ccc atc aag agt gag cct ctg cca aaa ccg 587
Met Glu Asp Lys Glu Ile Pro Ile Lys Ser Glu Pro Leu Pro Lys Pro
1 5 10 15

ccc gca tct gcc cca cca tcc atc ctg gtg aaa cca gaa aac tca aga 635
Pro Ala Ser Ala Pro Pro Ser Ile Leu Val Lys Pro Glu Asn Ser Arg
20 25 30

aat gga atc gaa aag caa gtc aaa acc gtg aga ttt cag aat tac agc 683
Asn Gly Ile Glu Lys Gln Val Lys Thr Val Arg Phe Gln Asn Tyr Ser
35 40 45

cct cct ccc acc aaa cat tac acc tcc cat ccc acc tcc gga aag cct 731
Pro Pro Pro Thr Lys His Tyr Thr Ser His Pro Thr Ser Gly Lys Pro
50 55 60

gaa cag cca gcc acc ctc aag gcg tcc cag cct gaa gca gcg tcc ttg 779
Glu Gln Pro Ala Thr Leu Lys Ala Ser Gln Pro Glu Ala Ala Ser Leu
65 70 75 80

ggc cca gag atg acc gtc cta ttt gcc cac cga agt ggc tgc cac tcc 827
Gly Pro Glu Met Thr Val Leu Phe Ala His Arg Ser Gly Cys His Ser
85 90 95

gga cag cag aca gac ctc cgg aga aag tca gct ctt gcc aag gcc aca 875
Gly Gln Gln Thr Asp Leu Arg Arg Lys Ser Ala Leu Ala Lys Ala Thr
100 105 110

acc ctg gtg tcc act gcc tca ggc acg cag acc gtg ttt ccc agc aaa 923
Thr Leu Val Ser Thr Ala Ser Gly Thr Gln Thr Val Phe Pro Ser Lys
115 120 125

tgaacctacg ggtggctttt cctagacccc aaagaggtga attgcattta aatacagtct 983
gcctycactg aaaaaaaaaa aaaaa 1008

<210> 206
<211> 455
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 77..364

<400> 206

tggacaaatg gacctgcggt aggagagagg gacaacagta ggagcaggca gatcttgctg 60
tttcaaccaa aacctc atg ctg acc aga gtt gag gaa cag aag aag atg gtg 112
Met Leu Thr Arg Val Glu Glu Gln Lys Lys Met Val
1 5 10

aag gcc tgc agg tat agg tgt tca gca tgt cat ctg aaa tat tcc cca 160
Lys Ala Cys Arg Tyr Arg Cys Ser Ala Cys His Leu Lys Tyr Ser Pro
15 20 25

cag agg caa aaa gaa agg aaa tta tct ctg aaa agg ggg agg aca agt 208
Gln Arg Gln Lys Glu Arg Lys Leu Ser Leu Lys Arg Gly Arg Thr Ser
30 35 40

cag cag aat atg tca atg ttt tgg ttg aag aag ctg ctt gaa tct ggg 256
Gln Gln Asn Met Ser Met Phe Trp Leu Lys Lys Leu Leu Glu Ser Gly
45 50 55 60

ctt ttc tgt gcc atg tgt tct ccc agg gcc agc aca aag aag ggc ttt 304
Leu Phe Cys Ala Met Cys Ser Pro Arg Ala Ser Thr Lys Lys Gly Phe
65 70 75

tgg tgc agg ccc aag acc acc ata atc atc att gat tat tcc tct cca 352
Trp Cys Arg Pro Lys Thr Thr Ile Ile Ile Ile Asp Tyr Ser Ser Pro
80 85 90

cgc cag tgt ctc taaataaact ttctcttctt tctctgaaaa aaaaaaaaaa 404
Arg Gln Cys Leu
95

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaagaaaa aaaaaaaaaa a 455

<210> 207
<211> 749
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 65..544

<400> 207

cttttacgac gcgccggaaa gcaacggcaa gggccgcagc cagcaccggg cggagagggc 60
tacc atg ggg aaa atc gcg ctg caa ctc aaa gcc acg ctg gag aac atc 109
Met Gly Lys Ile Ala Leu Gln Leu Lys Ala Thr Leu Glu Asn Ile
1 5 10 15

acc aac ctc cgg ccc gtg ggc gag gac ttc cgg tgg tac ctg aag atg 157
Thr Asn Leu Arg Pro Val Gly Glu Asp Phe Arg Trp Tyr Leu Lys Met
20 25 30

aaa tgt ggc aac tgt ggt gag att tcg gac aag tgg cag tac atc cgg 205
Lys Cys Gly Asn Cys Gly Glu Ile Ser Asp Lys Trp Gln Tyr Ile Arg
35 40 45

ctg atg gac agt gtg gca ctg aag ggg ggc cgt ggc agt gct tcc atg 253
Leu Met Asp Ser Val Ala Leu Lys Gly Gly Arg Gly Ser Ala Ser Met
50 55 60

gtc cag aag tgc aag ctg tgt gca aga gaa aat tcc atc gag att tta 301
Val Gln Lys Cys Lys Leu Cys Ala Arg Glu Asn Ser Ile Glu Ile Leu
65 70 75

agc agc acc atc aag cct tac aat gct gaa gac aat gag aac ttc aag 349
Ser Ser Thr Ile Lys Pro Tyr Asn Ala Glu Asp Asn Glu Asn Phe Lys
80 85 90 95

aca ata gtg gag ttt gag tgc cgg ggc ctt gaa cca gtt gat ttc cag 397
Thr Ile Val Glu Phe Glu Cys Arg Gly Leu Glu Pro Val Asp Phe Gln
100 105 110

ccg cas gwg rtw ttg ctg ctg aag gtg tgg agt cag gga cag cct tca 445
Pro Xaa Xaa Xaa Leu Leu Leu Lys Val Trp Ser Gln Gly Gln Pro Ser
115 120 125

gtg aca tta atc tgc agg aga agg act ggg act gac tat gat gaa aag 493
Val Thr Leu Ile Cys Arg Arg Arg Thr Gly Thr Asp Tyr Asp Glu Lys
130 135 140

gcc cag gag tct gtg gga atc tat gag gtc acc cac cag ttt gtg aag 541
Ala Gln Glu Ser Val Gly Ile Tyr Glu Val Thr His Gln Phe Val Lys
145 150 155

tgc tgatccctct tccttcccag ttgcccttaa gaactgagaa aggacaaagt 594
Cys
160

actctaagca gcagagccca cagaggctcg ttcctttgac ccttgtctcc tggtggctat 654
acgaaacctt cacaatctgc atgctggact ttattacagc ttcccaagcc ccatcaataa 714
agcccctgtt cacgctacaa aaaaaaaaaa aaaaa 749

<210> 208
<211> 594
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 117..467

<400> 208

aaatgtagcc tggtggtgtt cccaggagga aaagaacgag agactggtgg cagcacaccc 60
tgggcccccc actccccgcc gcaagtcctg aggatggcca gcagagaaac aagaaa atg 119
Met
1

gac tcc ctg gct gct gga gag ttg aat gcc agc cac cag cca tgg gtg 167
Asp Ser Leu Ala Ala Gly Glu Leu Asn Ala Ser His Gln Pro Trp Val
5 10 15

cca gag ttt gta gcc tat tgg agg aaa aca cac caa gat cac ctc tgc 215
Pro Glu Phe Val Ala Tyr Trp Arg Lys Thr His Gln Asp His Leu Cys
20 25 30

agc ctg cac agc cgg gcc ttt gga ctc ctg gat gct aga gtg acc tgg 263
Ser Leu His Ser Arg Ala Phe Gly Leu Leu Asp Ala Arg Val Thr Trp
35 40 45

gcg ctg agg agg gcc ccc gag cca gta cca gga aag gat aga ctc ctg 311
Ala Leu Arg Arg Ala Pro Glu Pro Val Pro Gly Lys Asp Arg Leu Leu
50 55 60 65

ctt gca gca ttc cca gca gag gca tcg cct gtg gac acc gcg tct gtg 359
Leu Ala Ala Phe Pro Ala Glu Ala Ser Pro Val Asp Thr Ala Ser Val
70 75 80

tct gta tat ggc aga gct ccc aga tat atg cac aag gga gtg aaa aaa 407
Ser Val Tyr Gly Arg Ala Pro Arg Tyr Met His Lys Gly Val Lys Lys
85 90 95

tgt gtt tgc acc cca gtc tct aaa aat tca aca gcc tgg tta ctt ctg 455
Cys Val Cys Thr Pro Val Ser Lys Asn Ser Thr Ala Trp Leu Leu Leu
100 105 110

ggt ggt ata tcg taggtggctt taatacgtgt tatttgctca tctgtatttc 507
Gly Gly Ile Ser
115

ttactctttg cacaattaaa ccatgttcct tttacttatg tacattttta ataaaagaaa 567
gttgttaacg aaaaaaaaaa aaaaaaa 594



<210> 209
<211> 2098
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 893..1897

<400> 209

accaggtcct ccgtggtgca cctgaaatgg tcaacagaag tcttgtgaca cgtggaatca 60
tttcagagtc accccttctg cctcctgctc aagcaacaga cctgccgatc acccccgtcg 120
ggcccgcgtt tctcagggtc ttcctaatcc cctgggcttt ccggcttgtc gtgtgcctgg 180
agtcaggccg ccgtgcggca ggctgttaac ctagcctcgg ggagagtggg atggagccac 240
cttctcatgg aacgatcctc gccttccctc atctccattg ttttatggct tcacacggac 300
cgtggctttc tgcatggaag cttggtggcc agggtgctgt cactttggga agcagccaga 360
gaaccacagg atgcgtgaat cggtctcctt gtcttcatgg gcatctccgg ccagggtggt 420
gtcttgtgct gtgattagtg ggtcactggc aagtgtctga atgaagtgga ggttccggtg 480
gcaacagtga cggggaaggg ctatgggtcc gcctcaatgt catctgcccc atccctgggc 540
ctccaggaat ccagggtctc agcccctgct ttagaaggaa gtcctgacgg ccacgctgga 600
tctggaggac gtccggagct acagggcgga gatttcatct cgaaacctgg cggccagcag 660
ggcgagcccc taccccagag tgaaggtgga ctttgccctc tcgtgccacg aggacttgct 720
ggcacccatc tctgagccca tcgagtggaa ataccacagc cctgaggagg agataagcct 780
tggacctgcc tgctggctct gggatttttt aagacgaagt caacaggcag ggtttttgct 840
gcccttgagt ggcggggtgg acagcgcagc caccgcctgc ctcatctact cc atg tgc 898
Met Cys
1

tgc cag gtc tgc gag gcc gtg agg agt gga aat gag gaa gtg ctg gct 946
Cys Gln Val Cys Glu Ala Val Arg Ser Gly Asn Glu Glu Val Leu Ala
5 10 15

gat gtc cgc acc atc gtg aac cag atc agc tac acc ccc cag gat ccc 994
Asp Val Arg Thr Ile Val Asn Gln Ile Ser Tyr Thr Pro Gln Asp Pro
20 25 30

cga gac ctc tgt gga cgc ata ctg acc acc tgc tac atg gcc agc aag 1042
Arg Asp Leu Cys Gly Arg Ile Leu Thr Thr Cys Tyr Met Ala Ser Lys
35 40 45 50

aac tcc tcc cag gag acg tgc acc cgg gcc aga gag ttg gcc cag cag 1090
Asn Ser Ser Gln Glu Thr Cys Thr Arg Ala Arg Glu Leu Ala Gln Gln
55 60 65

att gga agc cac cac atc agt ctc aac atc gat cca gcc gtg aag gcc 1138
Ile Gly Ser His His Ile Ser Leu Asn Ile Asp Pro Ala Val Lys Ala
70 75 80

gtc atg ggc atc ttc agc ctg gtg acg ggg aag agc cct ctg ttt gca 1186
Val Met Gly Ile Phe Ser Leu Val Thr Gly Lys Ser Pro Leu Phe Ala
85 90 95

gct cat gga gga agc agc agg gaa aac ctg gcg ctg caa aat gtg cag 1234
Ala His Gly Gly Ser Ser Arg Glu Asn Leu Ala Leu Gln Asn Val Gln
100 105 110

gct cga ata cgg atg gtc ctc gcc tat ctg ttt gct cag ttg agc ctc 1282
Ala Arg Ile Arg Met Val Leu Ala Tyr Leu Phe Ala Gln Leu Ser Leu
115 120 125 130

tgg tct cgg ggt gtc cac ggt ggg ctc ctc gtg ctg gga tcc gcc aac 1330
Trp Ser Arg Gly Val His Gly Gly Leu Leu Val Leu Gly Ser Ala Asn
135 140 145

gtg gat gag agt ctc ctg ggc tac ctg acc aag tac gac tgc tcc agt 1378
Val Asp Glu Ser Leu Leu Gly Tyr Leu Thr Lys Tyr Asp Cys Ser Ser
150 155 160

gcg gac atc aac ccc ata ggc ggg atc agc aag acg gac ctc agg gcc 1426
Ala Asp Ile Asn Pro Ile Gly Gly Ile Ser Lys Thr Asp Leu Arg Ala
165 170 175

ttc gtc cag ttc tgc atc cag cgc ttc cag ctt cct gcc ctg cag agc 1474
Phe Val Gln Phe Cys Ile Gln Arg Phe Gln Leu Pro Ala Leu Gln Ser
180 185 190

atc ctg ttg gcg ccg gcc acc gca gag ctg gag ccc ttg gct gat gga 1522
Ile Leu Leu Ala Pro Ala Thr Ala Glu Leu Glu Pro Leu Ala Asp Gly
195 200 205 210

cag gtg tcc cag acc gac gag gaa gat atg ggg atg aca tat gcg gag 1570
Gln Val Ser Gln Thr Asp Glu Glu Asp Met Gly Met Thr Tyr Ala Glu
215 220 225

ctc tcg gtc tat ggg aaa ctc agg aag gtg gcc aag atg ggg ccc tac 1618
Leu Ser Val Tyr Gly Lys Leu Arg Lys Val Ala Lys Met Gly Pro Tyr
230 235 240

agc atg ttc tgc aaa ctc ctc ggc atg tgg aga cac atc tgc acc ccg 1666
Ser Met Phe Cys Lys Leu Leu Gly Met Trp Arg His Ile Cys Thr Pro
245 250 255

aga cag gtc gct gac aaa gtg aag cgg ttt ttc tcc aag tac tcc atg 1714
Arg Gln Val Ala Asp Lys Val Lys Arg Phe Phe Ser Lys Tyr Ser Met
260 265 270

aac aga cac aag atg acc acg ctc aca ccc gcg tac cac gcc gag aac 1762
Asn Arg His Lys Met Thr Thr Leu Thr Pro Ala Tyr His Ala Glu Asn
275 280 285 290

tac agc cct gag gac aac agg ttt gat ctg cga cca ttt ctg tac aac 1810
Tyr Ser Pro Glu Asp Asn Arg Phe Asp Leu Arg Pro Phe Leu Tyr Asn
295 300 305

aca agc tgg cct tgg cag ttt cgg tgc ata gaa aat cag gtg cta cag 1858
Thr Ser Trp Pro Trp Gln Phe Arg Cys Ile Glu Asn Gln Val Leu Gln
310 315 320

ctc gag agg gca gag cca cag tcc ctg gac ggc gtg gac tgaggccggt 1907
Leu Glu Arg Ala Glu Pro Gln Ser Leu Asp Gly Val Asp
325 330 335

tccttcctgg aggcctcctg tcctcgggga ccccagcacc tcatcatcag cattgctgga 1967
gccaagggta ggagccctac actaggagcc caggatggga cggcgcatca gccgagaggg 2027
agggaacttt tcagtcaaat tcctcaaaaa gaggctggaa taaagcctgg gctcaaaaaa 2087
aaaaaaaaaa a 2098

<210> 210
<211> 428
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 85..342

<400> 210

acactggtac agtcacctag cccatcagtt ccttcgtcga cagcgccggg gacatccaga 60
ctacaattta cagttcctct atcc atg tgc tgg gtt ata aat cat gcc atc 111
Met Cys Trp Val Ile Asn His Ala Ile
1 5

ctc cct aga atg aga atg cac agc aag cgg cag aca atc acc cgg cat 159
Leu Pro Arg Met Arg Met His Ser Lys Arg Gln Thr Ile Thr Arg His
10 15 20 25

tcg gca tct ctt tct ttt cac gcg ctc cct cgc tcc gcc ttt ctc cag 207
Ser Ala Ser Leu Ser Phe His Ala Leu Pro Arg Ser Ala Phe Leu Gln
30 35 40

ctc tgc ctt ctc agg cag ata cat cag ata cct tgt tta tcc atc ttc 255
Leu Cys Leu Leu Arg Gln Ile His Gln Ile Pro Cys Leu Ser Ile Phe
45 50 55

agc tcc act ctg agg gcg cag acg cac gat tcc ggg atc ggg tgc acc 303
Ser Ser Thr Leu Arg Ala Gln Thr His Asp Ser Gly Ile Gly Cys Thr
60 65 70

acg gcg aas cca ggc ggg aga cgg cag gag cag ctc agg taaccagggg 352
Thr Ala Xaa Pro Gly Gly Arg Arg Gln Glu Gln Leu Arg
75 80 85

aagcttgcgt gcccacggag atgcagccgt ggagctgtga ggaaagacgg tctggcttca 412
aaaaaaaaaa aaaaaa 428

<210> 211
<211> 769
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 155..433

<400> 211

atttttcccc ccttgctcgg gatggtgcca caggaggctg tgcgggcccc gctccgcttc 60
gaatggtgga tgctgtgggg caccacctcc ttgaggacca aggcactcca gctgccagga 120
atttggctgc taacctcaca cagctgagcc ttcc atg aaa att gct ctc tgc caa 175
Met Lys Ile Ala Leu Cys Gln
1 5

aga gaa ctt cct agt cca agg tca tgt cta ctc tcc aga gat gtg act 223
Arg Glu Leu Pro Ser Pro Arg Ser Cys Leu Leu Ser Arg Asp Val Thr
10 15 20

gga gtg att tgc acc cgg atg cct aga ctc gcc atc tgc tca aag act 271
Gly Val Ile Cys Thr Arg Met Pro Arg Leu Ala Ile Cys Ser Lys Thr
25 30 35

gct cag aaa gcc ctc cca tgc att ccc ctg ctg cat acc agc cca ctc 319
Ala Gln Lys Ala Leu Pro Cys Ile Pro Leu Leu His Thr Ser Pro Leu
40 45 50 55

tgc ctg cag ctg ctg tct gca gga ctt cat atc tat gcc aca ctg tgt 367
Cys Leu Gln Leu Leu Ser Ala Gly Leu His Ile Tyr Ala Thr Leu Cys
60 65 70

aaa agc tgt gct tca aga aat cac aaa aac att ttc ctg cac cta cta 415
Lys Ser Cys Ala Ser Arg Asn His Lys Asn Ile Phe Leu His Leu Leu
75 80 85

cac agc ctg agt gcg gca taagttgacc ttgcttgcta agaaatgggg 463
His Ser Leu Ser Ala Ala
90

caagaaatgc ttttttgtat gtgtcatgtc tgtttgtttt tcaattaaga gaggaaagca 523
ttaggcagat ggaatgtaca tgtgaggatg aggagacaga aaacaagtag ccctttccat 583
caagatagag ggttttctgg ggttgctggc tattgaatgt cactcctgat ttctctttcc 643
aaggcactgt accaccagcc tactgagatt gtgtgggagc tttcatgggg gttgtatttc 703
actgatgaaa ataaattttt tgcataatgt gaaaaaaaaa aaaaaaaaga aaaaaaaaaa 763
aaaaaa 769

<210> 212
<211> 914
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 63..386

<400> 212

ctttttaggg agtccaaggt acagtcgccg cgtgcggagc ttgttactgg ttacttggcc 60
tc atg gcg gtc cga gct tcg ttc gag aac aac tgt gag atc ggc tgc 107
Met Ala Val Arg Ala Ser Phe Glu Asn Asn Cys Glu Ile Gly Cys
1 5 10 15

ttt gcc aag ctc acc aac acc tac tgt ctg gta gcg atc gga ggc tca 155
Phe Ala Lys Leu Thr Asn Thr Tyr Cys Leu Val Ala Ile Gly Gly Ser
20 25 30

gag aac ttc tac agt gtg ttc gag ggc gag ctc tcc gat acc atc ccc 203
Glu Asn Phe Tyr Ser Val Phe Glu Gly Glu Leu Ser Asp Thr Ile Pro
35 40 45

gtg gtg cac gcg tct atc gcc ggc tgc cgc atc atc ggg cgc atg tgt 251
Val Val His Ala Ser Ile Ala Gly Cys Arg Ile Ile Gly Arg Met Cys
50 55 60

gtg gga gac aga aga aat tct ggc aga tgt gct caa ggt gga agt ctt 299
Val Gly Asp Arg Arg Asn Ser Gly Arg Cys Ala Gln Gly Gly Ser Leu
65 70 75

cag aca gac agt ggc cga cca ggt gct agt agg aag cta ctg tgt ctt 347
Gln Thr Asp Ser Gly Arg Pro Gly Ala Ser Arg Lys Leu Leu Cys Leu
80 85 90 95

cag caa tca ggg agg gct ggt gca tcc caa gac ttc aat tgaagaccag 396
Gln Gln Ser Gly Arg Ala Gly Ala Ser Gln Asp Phe Asn
100 105

gatgagctgt cctctcttct tcaagtcccc cttgtggcgg ggactgtgaa ccgaggcagt 456
gaggtgattg ctgctgggat ggtggtgaat gactggtgtg ccttctgtgg cctggacaca 516
accagcacag agctgtcagt ggtggagagt gtcttcaagc tgaatgaagc ccagcctagc 576
accattgcca ccagcatgcg ggattccctc attgacagcc tcacctgagt caccttccaa 636
gttgttccat gggctcctgg ctctggactg tggccaacct tctccacatt ccgcccaatc 696
tgtacctgat gctggcaggg aggtggcaga gagctcactg ggactgaggg gctgggcacc 756
caaccctttt ccacctgtgc ttatcgcctg gatctatcat tactgcaaaa acctgctctg 816
ttgtgctggc tggcaggccc tgtggctgct ggctgagggt tctgctgtcc tgtgccaccc 876
cattaaagtg cagttccctc caaaaaaaaa aaaaaaaa 914

<210> 213
<211> 1489
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 460..1290

<400> 213

cttctttccc tctccgtttt ggtgggctgg ttgaagatga aatccactga ggagggaagt 60
ccagcaccct gtgtgccagt ccagaactgg cccatctgta gaccccctga aaatcatatg 120
ggcttggatt tggatattct caacagaaag ggttaaaggc tgatggtacc taaagcctgg 180
tacttgaatt ttgatcaaga taagctgcct taagttctct tcattacaca aatgatccta 240
gataattgat agatcctgtg gttcaactgg atttctagat agaagctgga ttcatgtgat 300
gccagaggag taaaatttca agagactgaa accagatctg agtttcgctg ttccagtctg 360
gacctctttg gtgctgtaaa tcctggatat actgtagatg agtactgcgt ttttctttta 420
tggactctct tcagcttctg gagacctcac tatcctatt atg tct ttg tgt gaa 474
Met Ser Leu Cys Glu
1 5

gac atg ctg ctt tgt aat tat cga aag tgt cgc atc aaa ctc tct ggc 522
Asp Met Leu Leu Cys Asn Tyr Arg Lys Cys Arg Ile Lys Leu Ser Gly
10 15 20

tat gca tgg gtc act gcc tgc tct cac atc ttc tgt gat cag cat ggc 570
Tyr Ala Trp Val Thr Ala Cys Ser His Ile Phe Cys Asp Gln His Gly
25 30 35

agt ggt gag ttt agt cgc tca cca gct atc tgt cct gcc tgc aac agt 618
Ser Gly Glu Phe Ser Arg Ser Pro Ala Ile Cys Pro Ala Cys Asn Ser
40 45 50

acc ctt tct gga aag cta gat att gtc cgc aca gaa ctc agt cca tca 666
Thr Leu Ser Gly Lys Leu Asp Ile Val Arg Thr Glu Leu Ser Pro Ser
55 60 65

gag gaa tat aaa gct atg gta ttg gca gga ctg cga cca gag atc gtg 714
Glu Glu Tyr Lys Ala Met Val Leu Ala Gly Leu Arg Pro Glu Ile Val
70 75 80 85

ttg gac att agc tcc cga gcg ctg gcc ttc tgg aca tat cag gta cat 762
Leu Asp Ile Ser Ser Arg Ala Leu Ala Phe Trp Thr Tyr Gln Val His
90 95 100

cag gaa cgt ctc tat caa gaa tac aat ttc agc aag gct gag ggc cat 810
Gln Glu Arg Leu Tyr Gln Glu Tyr Asn Phe Ser Lys Ala Glu Gly His
105 110 115

ctg aaa cag atg gag aag ata tat act cag caa ata caa agc aag gat 858
Leu Lys Gln Met Glu Lys Ile Tyr Thr Gln Gln Ile Gln Ser Lys Asp
120 125 130

gta gaa ttg acc tct atg aaa ggg gag gtt acc tcc atg aag aaa gta 906
Val Glu Leu Thr Ser Met Lys Gly Glu Val Thr Ser Met Lys Lys Val
135 140 145

cta gaa gaa tac aag aaa aag ttc agt gac atc tct gag aaa ctt atg 954
Leu Glu Glu Tyr Lys Lys Lys Phe Ser Asp Ile Ser Glu Lys Leu Met
150 155 160 165

gag cgc aat cgt cag tat caa aag ctc caa ggc ctc tat gat agc ctt 1002
Glu Arg Asn Arg Gln Tyr Gln Lys Leu Gln Gly Leu Tyr Asp Ser Leu
170 175 180

agg cta cga aac atc act att gct aac cat gaa ggc acc ctt gaa cca 1050
Arg Leu Arg Asn Ile Thr Ile Ala Asn His Glu Gly Thr Leu Glu Pro
185 190 195

tcc atg att gca cag tct ggt gtt ctt ggc ttc cca tta ggt aac aac 1098
Ser Met Ile Ala Gln Ser Gly Val Leu Gly Phe Pro Leu Gly Asn Asn
200 205 210

tcc aag ttt cct ttg gat aat aca cct gtt cga aat cgg ggc gat gga 1146
Ser Lys Phe Pro Leu Asp Asn Thr Pro Val Arg Asn Arg Gly Asp Gly
215 220 225

gat gga gat ttt cag ttc aga cca ttt ttt gcg ggt tct ccc aca gca 1194
Asp Gly Asp Phe Gln Phe Arg Pro Phe Phe Ala Gly Ser Pro Thr Ala
230 235 240 245

cct gaa ccc agc aac agc ttt ttt agt ttt gtc tct cca agt cgt gaa 1242
Pro Glu Pro Ser Asn Ser Phe Phe Ser Phe Val Ser Pro Ser Arg Glu
250 255 260

tta gag cag cag caa gtt tct agc agg gcc ttc aaa gta aaa aga att 1290
Leu Glu Gln Gln Gln Val Ser Ser Arg Ala Phe Lys Val Lys Arg Ile
265 270 275

tgagccacgc atagtgtcac gcacctgtga tcccagctac ttaggaggtt gaggctggga 1350
ggatcacttg agcccaggag tctgaggctt tagtgatcta agatcatgcc actgcactcc 1410
agcctgggca acagagtgag accctgtttc taaaaaaaaa taaagataat ttagctaact 1470
tcaaaaaaaa aaaaaaaaa 1489

<210> 214
<211> 776
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 21..539

<400> 214

caaatatttc catcacgggg atg ctt gtc atg tac ctg ctt gcc gcc ctc ttt 53
Met Leu Val Met Tyr Leu Leu Ala Ala Leu Phe
1 5 10

ggt tac cta acc ttc tat gga gaa gtt gaa gat gaa tta ctt cat gcc 101
Gly Tyr Leu Thr Phe Tyr Gly Glu Val Glu Asp Glu Leu Leu His Ala
15 20 25

tac agc aaa gtg tat aca tta gac atc cct ctt ctc atg gtt cgc ctg 149
Tyr Ser Lys Val Tyr Thr Leu Asp Ile Pro Leu Leu Met Val Arg Leu
30 35 40

gca gtc ctt gtg gca gta aca cta act gtg ccc att gtc ctc ttc cca 197
Ala Val Leu Val Ala Val Thr Leu Thr Val Pro Ile Val Leu Phe Pro
45 50 55

att cgt aca tca gtg atc aca ctg tta ttt ccc aaa cga ccc ttc agc 245
Ile Arg Thr Ser Val Ile Thr Leu Leu Phe Pro Lys Arg Pro Phe Ser
60 65 70 75

tgg ata cga cat ttc ctg att gca gct gtg ctt att gca ctt aat aat 293
Trp Ile Arg His Phe Leu Ile Ala Ala Val Leu Ile Ala Leu Asn Asn
80 85 90

gtt ctg gtc atc ctt gtg cca act ata aaa tac atc ttc gga ttc ata 341
Val Leu Val Ile Leu Val Pro Thr Ile Lys Tyr Ile Phe Gly Phe Ile
95 100 105

ggg gct tct tct gcc act atg ctg att ttt att ctt cca gca gtt ttt 389
Gly Ala Ser Ser Ala Thr Met Leu Ile Phe Ile Leu Pro Ala Val Phe
110 115 120

tat ctt aaa ctt gtc aag aaa gaa act ttt agg tca ccc caa aag gtc 437
Tyr Leu Lys Leu Val Lys Lys Glu Thr Phe Arg Ser Pro Gln Lys Val
125 130 135

ggg gct tta att ttc ctt gtg gtt gga ata ttc ttc atg att gga agc 485
Gly Ala Leu Ile Phe Leu Val Val Gly Ile Phe Phe Met Ile Gly Ser
140 145 150 155

atg gca ctc att ata att gac tgg att tat gat cct cca aat tcc aag 533
Met Ala Leu Ile Ile Ile Asp Trp Ile Tyr Asp Pro Pro Asn Ser Lys
160 165 170

cat cac taacacaagg aaaaatactt tctttttcta ttggaaatgg ttacaagtta 589
His His

tactccaaaa gatatttgaa ttatcttgat tggaatgtta ttcataggaa ataacaggaa 649
gattccaaag acgtttacca gtmatatcac caggcacctg cagaagagga aaatcactgt 709
ttttgtcaag gatggttgtg tatgtgttta aaataaaacc tgtggtgcac aaaaaaaaaa 769
aaaaaaa 776

<210> 215
<211> 1412
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 34..1143

<400> 215

atgcggtgaa gggcgagcgg cgcggcggct gcg atg agt gcc tct gcg gcc acc 54
Met Ser Ala Ser Ala Ala Thr
1 5

ggg gtc ttc gtg ctg tcc ctc tcg gcc atc ccg gtc acc tat gtc ttc 102
Gly Val Phe Val Leu Ser Leu Ser Ala Ile Pro Val Thr Tyr Val Phe
10 15 20

aac cac ctg gcg gcc cag cat gat tcc tgg act att gta ggg gtt gct 150
Asn His Leu Ala Ala Gln His Asp Ser Trp Thr Ile Val Gly Val Ala
25 30 35
246
Pro
Pro
60
Pro
Leu
Phe
Tyr
65
70
75
Ile
80
85
Ile
10
342
90
Ile
438
Tyr
Met
Val
Met
Val
Ala
Ala
Ile
Trp
Glu
Glu
Thr
120
486
20
Tyr
Ile
Gly
140
Trp
Val
Gly
145
Ser
Ile
Ile
Met
Ser
150
Val
Val
Phe
Pro
Gly
Ile
Val
Gly Lys
Thr
155
25
gct
act
cct
582
Pro
170
Ile
175

cctcctttct aaattactaa cttttgttat actggtactg atattttgtc ccatttcact 1223
ctcttctcat acgtgagtac ttaagaatat gtacattctt gctctgcact gtatgtgtga 1283
gctatatggt attgtgtaaa ttttttttga aggaaaatgg aaattcttga gaaacagttt 1343
gtttaaagaa atatattcaa aatcatttgt gaataaactt gatcatccat ctcaaaaaaa 1403
aaaaaaaaa 1412

<210> 216
<211> 1773
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 6..1184

<400> 216

ccaac atg acc tac agg tgg ggg aca ctg ctc atg aag aga aag ttt gag 50
Met Thr Tyr Arg Trp Gly Thr Leu Leu Met Lys Arg Lys Phe Glu
1 5 10 15

gag ccc cgg cca gga ttt cat ggt gtc ttg ggt atc aat tcc atc act 98
Glu Pro Arg Pro Gly Phe His Gly Val Leu Gly Ile Asn Ser Ile Thr
20 25 30

ggg aag gag gag cct ctg tac ccc agc tac aag aga cag ttg cgc att 146
Gly Lys Glu Glu Pro Leu Tyr Pro Ser Tyr Lys Arg Gln Leu Arg Ile
35 40 45

tac ctg gtc tcc ctg cca ttc gtg tgc ctc tgc ctc tat ttc tca ctg 194
Tyr Leu Val Ser Leu Pro Phe Val Cys Leu Cys Leu Tyr Phe Ser Leu
50 55 60

tat gtc atg atg att tac ttc gac atg gag gtt tgg gcc ttg ggt cta 242
Tyr Val Met Met Ile Tyr Phe Asp Met Glu Val Trp Ala Leu Gly Leu
65 70 75

cat gag aac agc ggg tct gag tgg acc agt gtc ctg ttg tat gtg ccc 290
His Glu Asn Ser Gly Ser Glu Trp Thr Ser Val Leu Leu Tyr Val Pro
80 85 90 95

agc atc atc tat gcc att gtg att gag atc atg aat cgt ctc tat cga 338
Ser Ile Ile Tyr Ala Ile Val Ile Glu Ile Met Asn Arg Leu Tyr Arg
100 105 110

tat gct gcc gag ttt tta act tca tgg gag aat cac aga ttg gaa tct 386
Tyr Ala Ala Glu Phe Leu Thr Ser Trp Glu Asn His Arg Leu Glu Ser
115 120 125

gcc tat cag aac cat cta att ctg aaa gtt tta gtg ttc aac ttc ctc 434
Ala Tyr Gln Asn His Leu Ile Leu Lys Val Leu Val Phe Asn Phe Leu
130 135 140

aat tgc ttt gcc tca ctc ttc tat att gcc ttt gtc ttg aaa gat atg 482
Asn Cys Phe Ala Ser Leu Phe Tyr Ile Ala Phe Val Leu Lys Asp Met
145 150 155

aag ctt ttg cgc cag agc ttg gcc act ctc cta att acc tcc cag atc 530
Lys Leu Leu Arg Gln Ser Leu Ala Thr Leu Leu Ile Thr Ser Gln Ile
160 165 170 175

ctc aac caa att atg gaa tct ttt ctt cct tat tgg ctc caa agg aag 578
Leu Asn Gln Ile Met Glu Ser Phe Leu Pro Tyr Trp Leu Gln Arg Lys
180 185 190

cat ggt gtg cgg gtg aag agg aag gtg cag gct tta aag gca gac att 626
His Gly Val Arg Val Lys Arg Lys Val Gln Ala Leu Lys Ala Asp Ile
195 200 205

gat gct aca tta tat gaa caa gtc atc ctg gaa aaa gaa atg gga act 674
Asp Ala Thr Leu Tyr Glu Gln Val Ile Leu Glu Lys Glu Met Gly Thr
210 215 220

tat ttg ggc acc ttt gat gat tac ttg gag tta ttc ctg cag ttt ggt 722
Tyr Leu Gly Thr Phe Asp Asp Tyr Leu Glu Leu Phe Leu Gln Phe Gly
225 230 235

tat gtg agc ctt ttc tcc tgt gtt tac cca tta gca gct gcc ttt gct 770
Tyr Val Ser Leu Phe Ser Cys Val Tyr Pro Leu Ala Ala Ala Phe Ala
818

866

914

962
Gln Val Asn Ala Val Phe Pro Glu
305 310 315

tca aaa gca gac ctc att ttg att gta gta gca gtg gag cac gca ctc 1010
Ser Lys Ala Asp Leu Ile Leu Ile Val Val Ala Val Glu His Ala Leu
320 325 330 335

ctg gct tta aag ttt ata ctt gca ttt gcc ata cct gat aag cca cgg 1058
Leu Ala Leu Lys Phe Ile Leu Ala Phe Ala Ile Pro Asp Lys Pro Arg
340 345 350

cat atc cag atg aaa cta gcc aga ctg gaa ttt gag tct ttg gag gca 1106
His Ile Gln Met Lys Leu Ala Arg Leu Glu Phe Glu Ser Leu Glu Ala
355 360 365

ctc aag cag cag caa atg aag ctc gtg acc gag aac ctg aag gag gaa 1154
Leu Lys Gln Gln Gln Met Lys Leu Val Thr Glu Asn Leu Lys Glu Glu
370 375 380

cca atg gaa agc ggg aag gag aag gca acc tgagtgccca gcgtgcccag 1204
Pro Met Glu Ser Gly Lys Glu Lys Ala Thr
385 390

ctgccctgtt ggcagaggcc tgtgtctgtg ccacacctgc cacggtggca gggggggtac 1264
ccggggcagc atcgtggctc ctgaacccag acccaatgct tagccaaacg aagtggctcc 1324
catgtggcaa gcacccttct cagtttcgca gtggcttggc tcgggatcct tggcagttcc 1384
cccagcccca ccctgtctgc tccttcccag ttccttcccg ggccccacac gctgctccag 1444
ctgccaactt tgctgcagag ccactgccgc ccttgagcct ctcaccatga gtgagccacc 1504
agctctccac gttcccctca tagcagtgtc actcccaacc ccaccatggc ccagggaccc 1564
gtggacaggt tggggatggg gtgtgtgccc actgtgctca tcacaggagc ctcagttgag 1624
agtgagcggg gtacagtaag gcagtgcttc ccacactgga cctctttcct ggttctcttt 1684
tgcaatacat taacagaccc tttatcaaca taaacaatag taactgagct attaaaggca 1744
aaaaaaaaaa taaaaaaaaa aaaaaaaaa 1773



<210> 217
<211> 1251
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 29..376

<400> 217

tatccggtcc tcggctgcgg cgggcacc atg gtc ggt ggc gag gcg gct gcc 52
Met Val Gly Gly Glu Ala Ala Ala
1 5

gca gtg gag gag ctg gtt tcg ggg gtg cgg cag gcg gcc gac ttc gcg 100
Ala Val Glu Glu Leu Val Ser Gly Val Arg Gln Ala Ala Asp Phe Ala
10 15 20

gag cag ttc cgc tcc tac tca gag agc gag aag caa tgg aag gcc cgc 148
Glu Gln Phe Arg Ser Tyr Ser Glu Ser Glu Lys Gln Trp Lys Ala Arg
25 30 35 40

atg gaa ttc atc ctg cgc cac ctg ccc gac tac cgc gac ccg ccc gac 196
Met Glu Phe Ile Leu Arg His Leu Pro Asp Tyr Arg Asp Pro Pro Asp
45 50 55

ggc agt ggc cgc ctg gac cag ctg ctc tcc ctc tcc atg gtc tgg gcc 244
Gly Ser Gly Arg Leu Asp Gln Leu Leu Ser Leu Ser Met Val Trp Ala
60 65 70

aac cat ctc ttc cta ggc tgc agt tac aat aaa gac ctt tta gac aag 292
Asn His Leu Phe Leu Gly Cys Ser Tyr Asn Lys Asp Leu Leu Asp Lys
75 80 85

gtg atg gaa atg gcc gat ggg att gaa gtg gaa gac ctg cca caa ttt 340
Val Met Glu Met Ala Asp Gly Ile Glu Val Glu Asp Leu Pro Gln Phe
90 95 100

act acc aga agt gaa tta atg aaa aag cat caa agc taagccagaa 386
Thr Thr Arg Ser Glu Leu Met Lys Lys His Gln Ser
105 110 115

gatttatcac attttcatca tcagctacag gattagaaag gaggctggga tgaatgtgac 446
atagaccaca gcagctctct taagactcct ggtattacca acataaagag gcaggtggaa 506
tgagaaggac tctgtctaga ttggcttttt taacattctc attttcccag gagttatcac 566
tgtaaaagta tgcatggata tttatgtatt tataaatcat gcactctaag atgagttcat 626
caacattgta aaagccctct tttctgtttt caggtttttt tttttcttat cgacaaggtc 686
tcactctgtc gcccaggcag aatcacaaag gtgcattatt ggctcattgc agcctcgaac 746
tcctgggctc atattttcag ggttttttgt tttttgtttt gtttttttga gacagagtct 806
tgctctgttg cccaggcagt agtgcmagtg gcgcgatata ttttcagttt ttaaacgtca 866
gaatttttgt ttaaaatgcc tttttgggct gggccacagt ggccttatgc ccataataat 926
cccagcactt tgggaggccg aggtgagcag atcacctgag gttaggagtt tgagaccagc 986
ctggccaaca cgatgaaacc ccgtctctac taaaaataca aacaaaatta gctgggcatg 1046
gtggcggaca tctgtaatcc cagctactca ggaggctgaa gcagaagaac tgcttgaacc 1106
tgggaggtgg aggttgcagt gagccaagat cgcaccattg cactccatcc tgggcgacaa 1166
aaatgaaaca ccgtctcaaa aaaaaaataa aaataataaa ataaaatgcc tttttgttgt 1226
tgatgtgaaa aaaaaaaaaa aaaaa 1251

<210> 218
<211> 894
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 78..566

<400> 218

gcgcgccatc ttggctccgg atcgtgcgtg aggcggcttc gtgggcagcg agagtcacag 60
acaagacagc aagcagg atg gag cac tac cgg aaa gct ggc tct gta gag 110
Met Glu His Tyr Arg Lys Ala Gly Ser Val Glu
1 5 10

ctc cca gcg cct tcc cca atg ccc cag cta cct cct gat acc ctt gag 158
Leu Pro Ala Pro Ser Pro Met Pro Gln Leu Pro Pro Asp Thr Leu Glu
15 20 25

atg cgg gtc cga gat ggc agc aaa att cgc aac ctg ctg ggg ttg gct 206
Met Arg Val Arg Asp Gly Ser Lys Ile Arg Asn Leu Leu Gly Leu Ala
30 35 40

ctg ggt cgg ttg gag ggc ggc agt gct cgg cat gta gtg ttc tca ggt 254
Leu Gly Arg Leu Glu Gly Gly Ser Ala Arg His Val Val Phe Ser Gly
45 50 55

tct ggc agg gct gca gga aag gct gtc agc tgc gct gag att gtc aag 302
Ser Gly Arg Ala Ala Gly Lys Ala Val Ser Cys Ala Glu Ile Val Lys
60 65 70 75

cgg cgg gtc cca ggc ctg cac cag ctc acc aag cta cgt ttc ctt cag 350
Arg Arg Val Pro Gly Leu His Gln Leu Thr Lys Leu Arg Phe Leu Gln
80 85 90

act gag gac agc tgg gtc cca gcc tca cct gac aca ggg cta gac ccc 398
Thr Glu Asp Ser Trp Val Pro Ala Ser Pro Asp Thr Gly Leu Asp Pro
95 100 105

ctc aca gtg cgc cgc cat gtg cct gca gtg tgg gtg ctg ctc agc cgg 446
Leu Thr Val Arg Arg His Val Pro Ala Val Trp Val Leu Leu Ser Arg
110 115 120

gac ccc ctg gac ccc aat gag tgt ggt tac caa ccc cca gga gca ccc 494
Asp Pro Leu Asp Pro Asn Glu Cys Gly Tyr Gln Pro Pro Gly Ala Pro
125 130 135

cct ggc ctg ggt tcc atg ccc agc tcc agc tgt ggc cct cgt tcc cga 542
Pro Gly Leu Gly Ser Met Pro Ser Ser Ser Cys Gly Pro Arg Ser Arg
140 145 150 155

aga agg gct cga gac acc cga tcg tgaagacctg ctgagccagc ctgttctccg 596
Arg Arg Ala Arg Asp Thr Arg Ser
160

ggcctgaatg tctggggtgc ttgtgccttt tctgagaagc gttgtgactg ctcaacatcc 656
ccatcaaggt ttgagtccac aaaagtggac ctccctatca tgcttcccct tccctctagc 716
atgtgggaag ggactgctgt gaagaatgac agatgtgggg cctctgccaa gttctgcatt 776
gctaaataag ggcttcctct gccttctacc tacagtgcat ttgaactgcc ttctgaaaga 836
ggtccaggga gggatttagg aaataaagtt tctacctatt taaaaaaaaa aaaaaaaa 894

<210> 219
<211> 910
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 16..705

<400> 219

acatgagcca ccaaa atg gtg gtg ttc ggg tat gag gct ggg act aag cca 51
Met Val Val Phe Gly Tyr Glu Ala Gly Thr Lys Pro
1 5 10

agg gat tca ggt gtg gtg ccg gtg gga act gag gaa gcg ccc aag gtt 99
Arg Asp Ser Gly Val Val Pro Val Gly Thr Glu Glu Ala Pro Lys Val
15 20 25

ttc aag atg gca gca tct atg cat ggt cag ccc agt cct tct cta gaa 147
Phe Lys Met Ala Ala Ser Met His Gly Gln Pro Ser Pro Ser Leu Glu
30 35 40

gat gca aaa ctc aga aga cca atg gtc ata gaa atc ata gaa aaa aat 195
Asp Ala Lys Leu Arg Arg Pro Met Val Ile Glu Ile Ile Glu Lys Asn
45 50 55 60

ttt gac tat ctt aga aaa gaa atg aca caa aat ata tat caa atg gcg 243
Phe Asp Tyr Leu Arg Lys Glu Met Thr Gln Asn Ile Tyr Gln Met Ala
65 70 75

aca ttt gga aca aca gct ggt ttc tct gga ata ttc tca aac ttc ctg 291
Thr Phe Gly Thr Thr Ala Gly Phe Ser Gly Ile Phe Ser Asn Phe Leu
80 85 90

ttc aga cgc tgc ttc aag gtt aaa cat gat gct ttg aag aca tat gca 339
Phe Arg Arg Cys Phe Lys Val Lys His Asp Ala Leu Lys Thr Tyr Ala
95 100 105

tca ttg gct aca ctt cca ttt ttg tct act gtt gtt act gac aag ctt 387
Ser Leu Ala Thr Leu Pro Phe Leu Ser Thr Val Val Thr Asp Lys Leu
110 115 120

ttt gta att gat gct ttg tat tca gat aat ata agc aag gaa aac tgt 435
Phe Val Ile Asp Ala Leu Tyr Ser Asp Asn Ile Ser Lys Glu Asn Cys
125 130 135 140

gtt ttc aga agc tca ctg att ggc ata gtt tgt ggt gtt ttc tat ccc 483
Val Phe Arg Ser Ser Leu Ile Gly Ile Val Cys Gly Val Phe Tyr Pro
145 150 155

agt tct ttg gct ttt act aaa aat gga cgc ctg gca acc aag tat cat 531
Ser Ser Leu Ala Phe Thr Lys Asn Gly Arg Leu Ala Thr Lys Tyr His
160 165 170

acc gtt cca ctg cca cca aaa gga agg gtt tta atc cat tgg atg acg 579
Thr Val Pro Leu Pro Pro Lys Gly Arg Val Leu Ile His Trp Met Thr
175 180 185

ctt tgt caa aca caa atg aaa tta atg gcg att cct cta gtc ttt cag 627
Leu Cys Gln Thr Gln Met Lys Leu Met Ala Ile Pro Leu Val Phe Gln
190 195 200

att atg ttt gga ata tta aat ggt cta tac cat tat gca gta ttt gaa 675
Ile Met Phe Gly Ile Leu Asn Gly Leu Tyr His Tyr Ala Val Phe Glu
205 210 215 220

gag aca ctt gag aaa act ata cat gaa gag taaccaaaaa aatgaatggt 725
Glu Thr Leu Glu Lys Thr Ile His Glu Glu
225 230

tgctaactta gcaaaatgaa gtttctataa agaggactca ggcattgctg aaagagttaa 785
aagtaactgt gaacaaataa tttgttctgt gccttttgcc tggtatatag caaatactca 845
aaaaatattc aataattcaa tcaataaata taagtttcat cttacaccaa aaaaaaaaaa 905
aaaaa 910

<210> 220
<211> 519
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 103..405

<400> 220

acttccggtg cgaaccgcct cggccgttcc ctcgcggagc ttactgagcg cggccgccga 60
gcccagctcc gccgccgagc gcctgtgccg gcacgbhaca cc atg gag cgc ccg 114
Met Glu Arg Pro
1

gat aag gcg gcg ctg aac gca ctg cag cct cct gag ttc aga aat gaa 162
Asp Lys Ala Ala Leu Asn Ala Leu Gln Pro Pro Glu Phe Arg Asn Glu
5 10 15 20

agc tca tta gca tct aca ctg aag acg ctc ctg ttc ttc aca gct tta 210
Ser Ser Leu Ala Ser Thr Leu Lys Thr Leu Leu Phe Phe Thr Ala Leu
25 30 35

atg atc act gtt cct att ggg tta tat ttc aca act aaa tct tac ata 258
Met Ile Thr Val Pro Ile Gly Leu Tyr Phe Thr Thr Lys Ser Tyr Ile
40 45 50

ttt gaa ggc gcc ctt ggg atg tcc aat agg gac agc tat ttt tac gct 306
Phe Glu Gly Ala Leu Gly Met Ser Asn Arg Asp Ser Tyr Phe Tyr Ala
55 60 65

gct att gtt gca gtg gtc gcc gtc cat gtg gtg ctg gcc ctc ttt gtg 354
Ala Ile Val Ala Val Val Ala Val His Val Val Leu Ala Leu Phe Val
70 75 80

tat gtg gcc tgg aat gaa ggc tca cga cag tgb cgt gaa ggc aaa cag 402
Tyr Val Ala Trp Asn Glu Gly Ser Arg Gln Xaa Arg Glu Gly Lys Gln
85 90 95 100

gat taaagtgaac atcacctttt tatagcatta aattcatttt ttaaaatgat 455
Asp

aatgctggag ggggccatct gatttgaata aagttgaaag aacatgtaaa aaaaaaaaaa 515
aaaa 519

<210> 221
<211> 632
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 72..350

<400> 221

agtgagaccg cgcggcaaca gcttgcggct gcggtagtcc cgtgggcgct ccgctggctg 60
tgcaggcggc c atg gat tcc ttg cgg aaa atg ctg atc tca gtc gca atg 110
Met Asp Ser Leu Arg Lys Met Leu Ile Ser Val Ala Met
1 5 10

ctg ggc gca ggg gct ggc gtg ggc tac gcg ctc ctc gtt atc gtg acc 158
Leu Gly Ala Gly Ala Gly Val Gly Tyr Ala Leu Leu Val Ile Val Thr
15 20 25

ccg gga gag cgg cgg aag cag gaa atg cta aag gag atg cca ctg cag 206
Pro Gly Glu Arg Arg Lys Gln Glu Met Leu Lys Glu Met Pro Leu Gln
30 35 40 45

gac cca agg agc aga gag gag gcg gcc agg acc cag cag cta ttg ctg 254
Asp Pro Arg Ser Arg Glu Glu Ala Ala Arg Thr Gln Gln Leu Leu Leu
50 55 60

gcc act ctg cag gag gca gcg acc acg cag gag aac gtg gcc tgg agg 302
Ala Thr Leu Gln Glu Ala Ala Thr Thr Gln Glu Asn Val Ala Trp Arg
65 70 75

aag aac tgg atg gtt ggc ggc gaa ggc ggc gcc ggc ggg agg tca ccg 350
Lys Asn Trp Met Val Gly Gly Glu Gly Gly Ala Gly Gly Arg Ser Pro
80 85 90

tgagaccgga cttgcctccg tgggcgccgg accttggctt gggcgcagga atccgaggca 410
gcctttctcc ttcgtgggcc cagcggagag tccggaccga gataccatgc caggactctc 470
cggggtcctg tgagctgccg tcgggtgagc acgtttcccc caaaccctgg actgactgct 530
ttaaggtccg caaggcgggc cagggccgag acgcgagtcg gatgtggtga actgaaagaa 590
ccaataaaat catgttcctc cacccaaaaa aaaaaaaaaa aa 632

<210> 222
<211> 652
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 38..436



<400> 222

actgctgtcc cccgagctgc tctacgcgct ggcgcgg atg ggg cac ggg gac gag 55
Met Gly His Gly Asp Glu
1 5

atc gtt ctt gcg gac ttg aac ttc ccg gcc tcc tcc atc tgc cag tgt 103
Ile Val Leu Ala Asp Leu Asn Phe Pro Ala Ser Ser Ile Cys Gln Cys
10 15 20

ggg ccc atg gag atc cgt gca gac ggc ctg ggc atc ccg cag ctc ctg 151
Gly Pro Met Glu Ile Arg Ala Asp Gly Leu Gly Ile Pro Gln Leu Leu
25 30 35

gag gcc gtg ctg aag ctg ctg ccc ctg ... tat gtg gag agt ccg 199
Glu Ala Val Leu Lys Leu Leu Pro Leu ... Tyr Val Glu Ser Pro
40 45 50

gct gca gtc atg gag ctg gtg ccc agc gac aag gag agg ggc ctg cag 247
Ala Ala Val Met Glu Leu Val Pro Ser Asp Lys Glu Arg Gly Leu Gln
55 60 65 70

acc cca gtg tgg acg gag tac gag tcc atc cta cgc agg gcc ggc tgt 295
Thr Pro Val Trp Thr Glu Tyr Glu Ser Ile Leu Arg Arg Ala Gly Cys
75 80 85

gtg aga gcc ctg gca aag ata gag agg ttt gag ttt tat gaa cgg gct 343
Val Arg Ala Leu Ala Lys Ile Glu Arg Phe Glu Phe Tyr Glu Arg Ala
90 95 100

aag aag gct ttt gct gtt gtg gca acg ggg gag acg gcc ctc tac gga 391
Lys Lys Ala Phe Ala Val Val Ala Thr Gly Glu Thr Ala Leu Tyr Gly
105 110 115

aac ctc atc ctc agg aag ggg gtg ctt gcc ctc aac ccc ctg ctg 436
Asn Leu Ile Leu Arg Lys Gly Val Leu Ala Leu Asn Pro Leu Leu
120 125 130

taggcctggt gaagaccacc tgggccggaa gaggaactgg gggcaccctg agctccagta 496
ccaccactca caacaggcct cccagtggca gctcccagac ctgggccctg gccagggctc 556
taggggccgg cagtcttggg gtgggccctg ccaattggga cgagtatccc tgatttgtga 616
aaatgatgga aaaacgttca aaaaaaaaaa aaaaaa 652

<210> 223
<211> 650
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 38..322

<400> 223

actgctgtcc cccgagctgc tctacgcgct ggcgcgg atg ggg cac ggg gac gag 55
Met Gly His Gly Asp Glu
1 5

atc gtt ctt gcg gac ttg aac ttc ccg gcc tcc tcc atc tgc cag tgt 103
Ile Val Leu Ala Asp Leu Asn Phe Pro Ala Ser Ser Ile Cys Gln Cys
10 15 20

ggg ccc atg gag atc cgt gca gac ggc ctg ggc atc ccg cag ctc ctg 151
Gly Pro Met Glu Ile Arg Ala Asp Gly Leu Gly Ile Pro Gln Leu Leu
25 30 35

gag gcc gtg cta gct gct gcc cct gga cac cta tgt gga gag tcc ggc 199
Glu Ala Val Leu Ala Ala Ala Pro Gly His Leu Cys Gly Glu Ser Gly
40 45 50

tgc agt cat gga gct ggt gcc cag cga caa gga gag ggg cct gca gac 247
Cys Ser His Gly Ala Gly Ala Gln Arg Gln Gly Glu Gly Pro Ala Asp
55 60 65 70

ccc agt gtg gac gga gta cga gtc cat cct acg cag ggc cgg ctg tgt 295
Pro Ser Val Asp Gly Val Arg Val His Pro Thr Gln Gly Arg Leu Cys
75 80 85

gag agc cct ggc aaa gat aga gag gtt tgagttttat gaacgggcta 342
Glu Ser Pro Gly Lys Asp Arg Glu Val
90 95

agaaggcttt tgctgttgtg gcaacggggg agacggccct ctacggaaac ctcatcctca 402
ggaagggggt gcttgccctc aaccccctgc tgtaggcctg gtgaagacca cctgggccgg 462
aagaggaact gggggcaccc tgagctccag taccaccact cacaacaggc ctcccagtgg 522
cagctcccag acctgggccc tggccagggc tctaggggcc ggcagtcttg gggtgggccc 582
tgccaattgg gacgagtatc cctgatttgt gaaaatgatg gaaaaacgtt caaaaaaaaa 642
aaaaaaaa 650



<210> 224
<211> 502
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 202..480

<400> 224

attctaaggc tacaggtcct ttggcaactg ctccccctct tccttctcct tttttttcaa 60
tctaatatgt gggacatacc aagtgatgag atcctttctg gcatcctaag aaaatgccaa 120
ctatagaaca acagatgtag tctccttatt aagtctgaag accaaacttc ttagtgcaaa 180
gcagtcaagt cttttctcaa c atg acc cca atc aag ctt ttg aac tta aca 231
Met Thr Pro Ile Lys Leu Leu Asn Leu Thr
1 5 10

tca aga tat aac ttc aga aga acg ttt gga ata gag ctc agt tca aac 279
Ser Arg Tyr Asn Phe Arg Arg Thr Phe Gly Ile Glu Leu Ser Ser Asn
15 20 25

tct tcc tat tgc aaa cga gga aat ggc tac aga agc aga gtg ccc aaa 327
Ser Ser Tyr Cys Lys Arg Gly Asn Gly Tyr Arg Ser Arg Val Pro Lys
30 35 40

gaa tgc gaa tgc aac tgg ctt cat ctt gaa agc gac act ctg aag aaa 375
Glu Cys Glu Cys Asn Trp Leu His Leu Glu Ser Asp Thr Leu Lys Lys
45 50 55

tta ccc ata att tct ccc tct tgg aca tgc aga att atc ctg ttc ttg 423
Leu Pro Ile Ile Ser Pro Ser Trp Thr Cys Arg Ile Ile Leu Phe Leu
60 65 70

tat ttt tct ggc cag ctt ctc caa ctt tcc ctt tct tgt ttg caa cta 471
Tyr Phe Ser Gly Gln Leu Leu Gln Leu Ser Leu Ser Cys Leu Gln Leu
75 80 85 90

att aaa ctt taaggataaa aaaaaaaaaa aa 502
Ile Lys Leu

<210> 225 
<211> 1739 
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 171..1670

<400> 225 

actctggcct tgctgcttct ctccagctcc tgaacttttc tttcttccat catgctctga 60
gcccattcct tgaaaactaa aaggtccctg actcccagtc tgcagccatc ctgggcctgc 120
tgagctctga ttcaagtgcc tgcctctgcc ccttggtggg ctgaagcttc atg gag 176
Met Glu
1

gta tcc acc aac ccc tcc tcc aac atc gat cca ggc aac tat gtt gaa 224
Val Ser Thr Asn Pro Ser Ser Asn Ile Asp Pro Gly Asn Tyr Val Glu
5 10 15

atg aat gat tca atc acc cac cta ccc tct aaa gtg gtg ata caa gat 272
Met Asn Asp Ser Ile Thr His Leu Pro Ser Lys Val Val Ile Gln Asp
20 25 30

att act atg gag cta cac tgc cct ctg tgc aat gat tgg ttc cga gac 320
Ile Thr Met Glu Leu His Cys Pro Leu Cys Asn Asp Trp Phe Arg Asp
35 40 45 50

cca ctg atg cta agc tgt ggc cac aac ttc tgt gaa gcc tgt atc caa 368
Pro Leu Met Leu Ser Cys Gly His Asn Phe Cys Glu Ala Cys Ile Gln
55 60 65

gac ttt tgg agg ctg caa gca aag gaa aca ttc tgt cct gag tgt aag 416
Asp Phe Trp Arg Leu Gln Ala Lys Glu Thr Phe Cys Pro Glu Cys Lys
70 75 80

atg cta tgt cag tat aac aac tgt aca ttc aac cct gta ctg gac aag 464
Met Leu Cys Gln Tyr Asn Asn Cys Thr Phe Asn Pro Val Leu Asp Lys
85 90 95

ttg gta gag aag att aag aag tta ccc tta ctc aag ggc cat cca cag 512
Leu Val Glu Lys Ile Lys Lys Leu Pro Leu Leu Lys Gly His Pro Gln
100 105 110

tgc cca gag cat gga gag aac ctg aaa ctg ttc agt aaa cca gat ggg 560
Cys Pro Glu His Gly Glu Asn Leu Lys Leu Phe Ser Lys Pro Asp Gly
115 120 125 130

aaa ctg atc tgc ttt caa tgc aag gat gct cgg ttg tct gtg ggg cag 608
Lys Leu Ile Cys Phe Gln Cys Lys Asp Ala Arg Leu Ser Val Gly Gln
135 140 145

tct aag gag ttc ctg caa atc tct gat gct gtc cat ttc ttc atg gag 656
Ser Lys Glu Phe Leu Gln Ile Ser Asp Ala Val His Phe Phe Met Glu
150 155 160

gag ctt gcc atc caa cag ggt caa ctg gag aca act ctg aag gag ctt 704
Glu Leu Ala Ile Gln Gln Gly Gln Leu Glu Thr Thr Leu Lys Glu Leu
165 170 175

cag acc ctg agg aac atg cag aag gaa gct att gct gct cac aag gaa 752
Gln Thr Leu Arg Asn Met Gln Lys Glu Ala Ile Ala Ala His Lys Glu
180 185 190

aac aag cta cat ctg cag caa cat gtg tcc atg gag ttt cta aag ctg 800
Asn Lys Leu His Leu Gln Gln His Val Ser Met Glu Phe Leu Lys Leu
195 200 205 210

cat cag ttc ctg cac agc aaa gaa aag gac att tta act gag ctc cgg 848
His Gln Phe Leu His Ser Lys Glu Lys Asp Ile Leu Thr Glu Leu Arg
215 220 225

gaa gag ggg aaa gcc ttg aat gag gag atg gag ttg aat ctg agc cag 896
Glu Glu Gly Lys Ala Leu Asn Glu Glu Met Glu Leu Asn Leu Ser Gln
230 235 240

ctt cag gag caa tgt ctc tta gcc aag gat atg ttg gtg agc att cag 944
Leu Gln Glu Gln Cys Leu Leu Ala Lys Asp Met Leu Val Ser Ile Gln
245 250 255

gca aag acg gaa caa cag aac tcc ttc gac ttt ctc aaa gac atc aca 992
Ala Lys Thr Glu Gln Gln Asn Ser Phe Asp Phe Leu Lys Asp Ile Thr
260 265 270

act ctc tta cat agc ttg gag caa gga atg aag gtg ctg gca acc aga 1040
Thr Leu Leu His Ser Leu Glu Gln Gly Met Lys Val Leu Ala Thr Arg
275 280 285 290

gag ctt att tcc aga aag ctg aac ctg ggc cag tac aaa ggt cct atc 1088
Glu Leu Ile Ser Arg Lys Leu Asn Leu Gly Gln Tyr Lys Gly Pro Ile
295 300 305

cag tac atg gta tgg agg gaa atg cag gac act ctc tgc cca ggc ctg 1136
Gln Tyr Met Val Trp Arg Glu Met Gln Asp Thr Leu Cys Pro Gly Leu
310 315 320

tct cca cta act ctg gac cct aaa aca gct cac cca aat ctg gtg ctc 1184
Ser Pro Leu Thr Leu Asp Pro Lys Thr Ala His Pro Asn Leu Val Leu
325 330 335

tcc aaa agc caa acc agc gtc tgg cat ggt gac att aag aag ata atg 1232
Ser Lys Ser Gln Thr Ser Val Trp His Gly Asp Ile Lys Lys Ile Met
340 345 350

cct gat gat cct gag agg ttt gac tca agt gtg gct gta ctg ggc tca 1280
Pro Asp Asp Pro Glu Arg Phe Asp Ser Ser Val Ala Val Leu Gly Ser
355 360 365 370

aga ggc ttc acc tct gga aag tgg tac tgg gaa gta gaa gta gca aag 1328
Arg Gly Phe Thr Ser Gly Lys Trp Tyr Trp Glu Val Glu Val Ala Lys
375 380 385

aag aca aaa tgg aca gtt gga gtt gtc aga gaa tcc atc att cgg aag 1376
Lys Thr Lys Trp Thr Val Gly Val Val Arg Glu Ser Ile Ile Arg Lys
390 395 400

ggc agc tgt cct cta act cct gag caa gga ttc tgg ctt tta aga cta 1424
Gly Ser Cys Pro Leu Thr Pro Glu Gln Gly Phe Trp Leu Leu Arg Leu
405 410 415

agg aac caa act gat cta aag gct ctg gat ttg cct tct ttc agt ctg 1472
Arg Asn Gln Thr Asp Leu Lys Ala Leu Asp Leu Pro Ser Phe Ser Leu
420 425 430

aca ctg act aac aac ctc gac aag gtg ggc ata tac ctg gat tat gaa 1520
Thr Leu Thr Asn Asn Leu Asp Lys Val Gly Ile Tyr Leu Asp Tyr Glu
435 440 445 450

gga gga cag ttg tcc ttc tac aat gct aaa acc atg act cac att tac 1568
Gly Gly Gln Leu Ser Phe Tyr Asn Ala Lys Thr Met Thr His Ile Tyr
455 460 465

acc ttc agt aac act ttc atg gag aaa ctt tat ccc tac ttc tgc ccc 1616
Thr Phe Ser Asn Thr Phe Met Glu Lys Leu Tyr Pro Tyr Phe Cys Pro
470 475 480

tgc ctt aat gat ggt aga gag aat aaa gaa cca ttg cac atc tta cat 1664
Cys Leu Asn Asp Gly Arg Glu Asn Lys Glu Pro Leu His Ile Leu His
485 490 495

cca cag taatgagtca taatattata caaattcaga gtgttattaa agaggttttg 1720
Pro Gln
500

aaataaaaaa aaaaaaaaa 1739



<210> 226 

<211> 657 

<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 199..618

<400> 226 

aactggatag agtactgccc ccttcagccc atggagaaag gcaaatgcct ccttcagagt 60
ctacctaatg ctttctcaga taaataagca tgaagaaaag tcaaagtcca ttctagctct 120
aaaataagga atgaaatgtt ttcctgatat gattttttgt tttcatctga taataatttt 180
atatatcaca gaaacagc atg gtt ctt act aaa cct ctt caa aga aat ggc 231
Met Val Leu Thr Lys Pro Leu Gln Arg Asn Gly
1 5 10

agc atg atg agc ttt gaa aat gtg aaa gaa aag agc aga gaa gga ggg 279
Ser Met Met Ser Phe Glu Asn Val Lys Glu Lys Ser Arg Glu Gly Gly
15 20 25

ccc cat gca cac aca ccc gaa gaa gaa ttg tgt ttc gtg gta aca cac 327
Pro His Ala His Thr Pro Glu Glu Glu Leu Cys Phe Val Val Thr His
30 35 40

tac cct cag gtt cag acc aca ctc aac ctg ttt ttc cat ata ttc aag 375
Tyr Pro Gln Val Gln Thr Thr Leu Asn Leu Phe Phe His Ile Phe Lys
45 50 55

gtt ctt act caa cca ctt tcc ctt ctg tgg ggt tgt gat cag aag cct 423
Val Leu Thr Gln Pro Leu Ser Leu Leu Trp Gly Cys Asp Gln Lys Pro
60 65 70 75

cgt act gtt cct acc ctt gga aac ggc gca tgg gat acc tgc caa caa 471
Arg Thr Val Pro Thr Leu Gly Asn Gly Ala Trp Asp Thr Cys Gln Gln
80 85 90

cac ata cgc act tca tca tgg aca gca aac aca ctc gtc att caa aac 519
His Ile Arg Thr Ser Ser Trp Thr Ala Asn Thr Leu Val Ile Gln Asn
95 100 105

cag cat tca cgg gaa agc act gtt tct gtt tgc ctt ttt atg tta atc 567
Gln His Ser Arg Glu Ser Thr Val Ser Val Cys Leu Phe Met Leu Ile
110 115 120

cgc atg caa cat att ttg aaa aca gat aca ctt caa cag ttc aga ata 615
Arg Met Gln His Ile Leu Lys Thr Asp Thr Leu Gln Gln Phe Arg Ile
125 130 135

tgc tagtactaat aaaaccaaca tgttaaaaaa aaaaaaaaa 657
Cys
140

<210> 227 
<211> 888
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 182..481

<400> 227 

attttgcctc tcagtgttca agcttgagcc cacgcatcca actcctgaga tcttactggg 60
aagctgctga tcatcagttt caggaagtca gcatggatca gccttacgtt catggcctcc 120
aggccctatt ctcctgcctc acagggaccg gccaggatct ctatccttac agcacgttgg 180
a atg tat atg ctc ctc tcc cca cat cgc ctt agg gag cag gca ggt gtc 229
Met Tyr Met Leu Leu Ser Pro His Arg Leu Arg Glu Gln Ala Gly Val
1 5 10 15

agg ggc agc ata agg acg gcc aac agg aca gaa gac ggg ttg aag atc 277
Arg Gly Ser Ile Arg Thr Ala Asn Arg Thr Glu Asp Gly Leu Lys Ile
20 25 30

cga gag gct gag tca ctt cca caa agt aac aca gct gat ttt aaa tgc 325
Arg Glu Ala Glu Ser Leu Pro Gln Ser Asn Thr Ala Asp Phe Lys Cys
35 40 45

ctg cat tca gca tcc ctg cag cag gct cca ggt gga att cta atg gga 373
Leu His Ser Ala Ser Leu Gln Gln Ala Pro Gly Gly Ile Leu Met Gly
50 55 60

cca gcc tcc agt ccc tgg acc tta gcc gtg gaa gga gag aag agg aca 421
Pro Ala Ser Ser Pro Trp Thr Leu Ala Val Glu Gly Glu Lys Arg Thr
65 70 75 80

tct gca cct cct ctc aga gaa agc ctg atg cct act aaa gga ctt ggg 469
Ser Ala Pro Pro Leu Arg Glu Ser Leu Met Pro Thr Lys Gly Leu Gly
85 90 95

tgg tgg acg cag tgaccctcag tctggagctt gttcactgaa cattggagac 521
Trp Trp Thr Gln
100

tatcatttgc gcagatggtc ttgggcctct atgagcagca ggctgcaccc cacagtgacc 581
tcctcattct actctgaggc atcttcatga aagcagatgt ccattgaaaa gcacccaagt 641
gcagtctcag ctgatgaact tcagaggcga ttgagacaaa ggctctcggt cccctctgcc 701
cttggatggt gcctctggta tgcacttggc ctctgtgtct ttatttagac tggtcacttc 761
acaacccatc atgtcacccc acccctaacc gtgcccactc tgggtcctcc cctcaactgc 821
ctgacttccc actttgagct cagcaaaggc aatagatgtt ttgtctgctt cgaaaaaaaa 881
aaaaaaa 888



<210> 228 

<211> 716 

<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 161..517



<400> 228 

acctgtcatt atgcttacta acgttcggga cgtctcccgg gctgcttggg cgaggagagg 60
caggggtgtg tgaccccggt ggttactgtg ctcgcgtaga gcacctaggg cctgctgaag 120
ccctccctcg cccgcgcctc tccttagtcc ttgagatgag atg gca agt tac agc 175
Met Ala Ser Tyr Ser
1 5

ggc ttc tcc ggc ctg ctg gag att cgc tac ggg cca gga cac cgc agc 223
Gly Phe Ser Gly Leu Leu Glu Ile Arg Tyr Gly Pro Gly His Arg Ser
10 15 20

tgc ctt ccc caa ttc gct ttc ttt ccg cag ccg ccg ctg ccc cga ccc 271
Cys Leu Pro Gln Phe Ala Phe Phe Pro Gln Pro Pro Leu Pro Arg Pro
25 30 35

cgg atc tgc atg tgg gtg ctg gct gag ctg ctg gag cta ggg tgt cct 319
Arg Ile Cys Met Trp Val Leu Ala Glu Leu Leu Glu Leu Gly Cys Pro
40 45 50

gag cag agc ctg agg gac gcc atc acc ctg gac ctc ttc tgc cac gcg 367
Glu Gln Ser Leu Arg Asp Ala Ile Thr Leu Asp Leu Phe Cys His Ala
55 60 65

ctc att ttc tgc cgc cag cag ggc ttc tca ctg gag cag acg tca gcg 415
Leu Ile Phe Cys Arg Gln Gln Gly Phe Ser Leu Glu Gln Thr Ser Ala
70 75 80 85

gct tgt gcc ctg ctc cag gat ctt cac aag gct tgt att ggt gag agg 463
Ala Cys Ala Leu Leu Gln Asp Leu His Lys Ala Cys Ile Gly Glu Arg
90 95 100

ggg cag cta cca ggt ttg agc ccc agg gag aag agg aac cgg gcc tgg 511
Gly Gln Leu Pro Gly Leu Ser Pro Arg Glu Lys Arg Asn Arg Ala Trp
105 110 115

cac aag tgaccatggg aagcagaagc aggggatttc tgcctggaat atgtcattat 567
His Lys

tagtagcatc atcatacaca agccatcagc tttccaatcc actgcttcct tatctagaaa 627
ttaaggatac agcacacatt ttacaggact gttctgagaa ataatatatg caaatatatg 687
catagtgcac aataaaaaaa aaaaaaaaa 716

<210> 229
<211> 654
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 86..505

<400> 229

agttcgcggt gtcagcctcc gcctccgagc ctcagttgtc ttctctgtga ggtgggaatg 60
ccggtgaatc ctgccgctgg cgtgg atg aga agt gaa tgc gtg ctc gga gct 112
Met Arg Ser Glu Cys Val Leu Gly Ala
1 5

gcg agt gac agc ggg cag gag gcg ccc agg gac act tgg ttt ctc cag 160
Ala Ser Asp Ser Gly Gln Glu Ala Pro Arg Asp Thr Trp Phe Leu Gln
10 15 20 25

ggc tgg aag gct tct aga agg ttc ctc atc aag gga agt gtg gct ggg 208
Gly Trp Lys Ala Ser Arg Arg Phe Leu Ile Lys Gly Ser Val Ala Gly
30 35 40

ggc gcc gtc tac ctg gtg tac gac cag gag ctg ctg ggg ccc agc gac 256
Gly Ala Val Tyr Leu Val Tyr Asp Gln Glu Leu Leu Gly Pro Ser Asp
45 50 55

aag agc cag gca gcc cta cag aag gct ggg gag gtg gtc ccc ccc gcc 304
Lys Ser Gln Ala Ala Leu Gln Lys Ala Gly Glu Val Val Pro Pro Ala
60 65 70

atg tac cag ttc agc cag tac gtg tgt cag cag aca ggc ctg cag ata 352
Met Tyr Gln Phe Ser Gln Tyr Val Cys Gln Gln Thr Gly Leu Gln Ile
75 80 85

ccc cag ctc cca gcc cct cca aag att tac ttt ccc atc cgt gac tcc 400
Pro Gln Leu Pro Ala Pro Pro Lys Ile Tyr Phe Pro Ile Arg Asp Ser
90 95 100 105

tgg aat gca ggc atc atg acg gtg atg tca gct ctg tcg gtg gcc ccc 448
Trp Asn Ala Gly Ile Met Thr Val Met Ser Ala Leu Ser Val Ala Pro
110 115 120

tcc aag gcc cgc gag tac tcc aag gag ggc tgg gag tat gtg aag gcg 496
Ser Lys Ala Arg Glu Tyr Ser Lys Glu Gly Trp Glu Tyr Val Lys Ala
125 130 135

cgc acc aag tagcgagtca gcaggggccg cctgccccgg ccagaacggg 545
Arg Thr Lys
140

cagggctgcc actgacctga agactccgga ctgggacccc actccgaggg cagctcccgg 605
ccttgccggc ccaataaagg acttcagaag tcaaaaaaaa aaaaaaaaa 654

<210> 230 
<211> 635 
<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
<222> 56..382

<400> 230

aattcgggtg gagctgagcc ggagacaggc agttgtgaaa aacttcagga caaaa atg 58
Met
1

ttt cat tta agg act tgt gct gct aag ttg agg cca ttg acg gct tcc 106
Phe His Leu Arg Thr Cys Ala Ala Lys Leu Arg Pro Leu Thr Ala Ser
5 10 15

cag act gtt aag aca ttt tca caa aac aga cca gca gca gct agg aca 154
Gln Thr Val Lys Thr Phe Ser Gln Asn Arg Pro Ala Ala Ala Arg Thr
20 25 30

ttt caa cag att cgg tgc tat tct gca cct gtt gct gct gag ccc ttt 202
Phe Gln Gln Ile Arg Cys Tyr Ser Ala Pro Val Ala Ala Glu Pro Phe
35 40 45

ctc agt ggg act agt tcg aac tat gtg gag gag atg tac tgt gct tgg 250
Leu Ser Gly Thr Ser Ser Asn Tyr Val Glu Glu Met Tyr Cys Ala Trp
50 55 60 65

ctg gaa aac ccc aaa agt gta cat aag aca ggg tcc cac tgt tgt cca 298
Leu Glu Asn Pro Lys Ser Val His Lys Thr Gly Ser His Cys Cys Pro
70 75 80

ggc tgg agt gca gtg gcg gga tct cgg ctt gct gca acc tcc gac tcc 346
Gly Trp Ser Ala Val Ala Gly Ser Arg Leu Ala Ala Thr Ser Asp Ser
85 90 95

tgg gtt caa gtg att ctt atg cct cag cct ccc gag taactgggac 392
Trp Val Gln Val Ile Leu Met Pro Gln Pro Pro Glu
100 105

tacaggtgca cgtcaccacg cctgactagt ttttgtattt ttagtagaga tgggatttta 452
ctttgttggc caggctggtc ttgaacccct ggcctcaagt gatccaccca ccttggcctc 512
ccaaagtgct gggattacag gtatgatcaa ccacgcctgg ccatgtcatg ccttgtgaca 572
gaattccttt attctgtttt gagccaataa atatttatag gtttcgaaaa aaaaaaaaaa 632
aaa 635



<210> 231 

<211> 634 

<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 56..355



<400> 231 

aattcgggtg gagctgagcc ggagacaggc agttgtgaaa aacttcagga caaaa atg 58
Met
1

ttt cat tta agg act tgt gct gct aag ttg agg cca ttg acg gct tcc 106
Phe His Leu Arg Thr Cys Ala Ala Lys Leu Arg Pro Leu Thr Ala Ser
5 10 15

cag act gtt aag aca ttt tca caa aac aga cca gca gca gct agg aca 154
Gln Thr Val Lys Thr Phe Ser Gln Asn Arg Pro Ala Ala Ala Arg Thr
20 25 30

ttt caa cag att cgt gct att ctg cac ctg ttg ctg ctg agc cct ttc 202
Phe Gln Gln Ile Arg Ala Ile Leu His Leu Leu Leu Leu Ser Pro Phe
35 40 45

tca gtg gga cta gtt cga act atg tgg agg aga tgt act gtg ctt ggc 250
Ser Val Gly Leu Val Arg Thr Met Trp Arg Arg Cys Thr Val Leu Gly
50 55 60 65

tgg aaa acc cca aaa gtg tac ata aga cag ggt ccc act gtt gtc cag 298
Trp Lys Thr Pro Lys Val Tyr Ile Arg Gln Gly Pro Thr Val Val Gln
70 75 80

gct gga gtg cag tgg cgg gat ctc ggc ttg ctg caa cct ccg act cct 346
Ala Gly Val Gln Trp Arg Asp Leu Gly Leu Leu Gln Pro Pro Thr Pro
85 90 95

ggg ttc aag tgattcttat gcctcagcct cccgagtaac tgggactaca 395
Gly Phe Lys
100

ggtgcacgtc accacgcctg actagttttt gtatttttag tagagatggg attttacttt 455
gttggccagg ctggtcttga acccctggcc tcaagtgatc cacccacctt ggcctcccaa 515
agtgctggga ttacaggtat gatcaaccac gcctggccat gtcatgcctt gtgacagaat 575
tcctttattc tgttttgagc caataaatat ttataggttt cgaaaaaaaa aaaaaaaaa 634



<210> 232 
<211> 583 
<212> DNA 
<213> Homo sapiens

<220>
<221> CDS
<222> 76..498

<400> 232

aatatagcca gccgcggctg cccttgcgct tcccgagctg gcggggtccg tggtgcggga 60
tcgagattgc gggct atg gcg ccg aag gtt ttt cgt cag tac tgg gat atc 111
Met Ala Pro Lys Val Phe Arg Gln Tyr Trp Asp Ile
1 5 10

ccc gat ggc acc gat tgc cac cgc aaa gcc tac agc acc acc agt att 159
Pro Asp Gly Thr Asp Cys His Arg Lys Ala Tyr Ser Thr Thr Ser Ile
15 20 25

gcc agc gtc gct ggc ctg acc gcc gct gcc tac aga gtc aca ctc aat 207
Ala Ser Val Ala Gly Leu Thr Ala Ala Ala Tyr Arg Val Thr Leu Asn
30 35 40

cct ccg ggc acc ttc ctt gaa gga gtg gct aag gtt gga caa tac acg 255
Pro Pro Gly Thr Phe Leu Glu Gly Val Ala Lys Val Gly Gln Tyr Thr
45 50 55 60

ttc act gca gct gct gtc ggg gcc gtg ttt ggc ctc acc acc tgc atc 303
Phe Thr Ala Ala Ala Val Gly Ala Val Phe Gly Leu Thr Thr Cys Ile
65 70 75

agc gcc cat gtc cgc gag aag ccc gac gac ccc ctg aac tac ttc ctc 351
Ser Ala His Val Arg Glu Lys Pro Asp Asp Pro Leu Asn Tyr Phe Leu
80 85 90

ggt ggc tgc gcc gga ggc ctg act ctg gga gca cgc acg cac aac tac 399
Gly Gly Cys Ala Gly Gly Leu Thr Leu Gly Ala Arg Thr His Asn Tyr
95 100 105

ggg att ggc gcc gcc gcc tgc gtg tac ttt ggc ata gcg gcc tcc ctg 447
Gly Ile Gly Ala Ala Ala Cys Val Tyr Phe Gly Ile Ala Ala Ser Leu
110 115 120

gtc aag atg ggc cgg ctg gag ggc tgg gag gtg ttt gca aaa ccc aag 495
Val Lys Met Gly Arg Leu Glu Gly Trp Glu Val Phe Ala Lys Pro Lys
125 130 135 140

gtg tgagccctgt gcctgccggg acctccagcc tgcagaatgc gtccagaaat 548
Val

aaattctgtg tctgtgtgaa aaaaaaaaaa aaaaa 583

<210> 233 
<211> 753 
<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 
<222> 199..600

<400> 233

atttttccga tgccaggcac cctcaaggca cagaggctgg ggctcatgtt gggggcactt 60
ggcctctcca ggcctcgaag gcttctctgg gctgatgcga gctggggaac gggagggacg 120
gacgtgggag cgagaacgtc acactggagg cagctggtgg cacgatgggg gacagagtga 180
aagagccttc gtgtcacc atg gcc aca cac ccc gat ggc ttc cgg ctt gag 231
Met Ala Thr His Pro Asp Gly Phe Arg Leu Glu
1 5 10

gga ccc ctg gct gca gcc cac agc cct ggg cct tgc act gtg ctc tac 279
Gly Pro Leu Ala Ala Ala His Ser Pro Gly Pro Cys Thr Val Leu Tyr
15 20 25

gaa ggc cct gtc cgt ggg ctc tgc ccy ttt gcc ccg cga aat tcc aac 327
Glu Gly Pro Val Arg Gly Leu Cys Pro Phe Ala Pro Arg Asn Ser Asn
30 35 40

acc atg gcg gcg gct gcc ctg gct gcc ccc agc ctg ggc ttc gat ggg 375
Thr Met Ala Ala Ala Ala Leu Ala Ala Pro Ser Leu Gly Phe Asp Gly
45 50 55

gtg att ggg gtg ctc gtg gct gat acc agc ctc acg gac atg cac gtg 423
Val Ile Gly Val Leu Val Ala Asp Thr Ser Leu Thr Asp Met His Val
60 65 70 75

gtg gat gta gag ctg agc gga ccc cgg ggc ccc act ggc cga agc ttt 471
Val Asp Val Glu Leu Ser Gly Pro Arg Gly Pro Thr Gly Arg Ser Phe
80 85 90

gct gtg cac acc cgc aga gag aac cct gcc gag cca ggc gcg gtc acc 519
Ala Val His Thr Arg Arg Glu Asn Pro Ala Glu Pro Gly Ala Val Thr
95 100 105

ggc tcc gcc acc gtc acg gcc ttc tgg cgg agc ctc ctg gcc tgc tgc 567
Gly Ser Ala Thr Val Thr Ala Phe Trp Arg Ser Leu Leu Ala Cys Cys
110 115 120

cag ctc ccc tcc agg ccg ggg atc cat ctc tgc tgagaagcct cctccctccc 620
Gln Leu Pro Ser Arg Pro Gly Ile His Leu Cys
125 130

gagacaagat catctgcctg gcctctcacc accaccatcc cacccctgcc ctgccccact 680
tccccagggt ctcccttctg actcagtaaa gatcaccgct gcctcccccc gcaaataaaa 740
aaaaaaaaaa aaa 753

<210> 234
<211> 762
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 211..612

<400> 234

atttccgatg ccaggcaccc tcaaggcaca gaggctgggg ctcatgttgg gggcacttgg 60
cctctccagg cctcgaaggc ttctcctggg ctgatgcgag ctggggaacg ggagggacgg 120
acgtgggagc gagaacgtca cactggaggc agctggtggc acgatggggg acagagtgaa 180
aggtagcaag tcaagagcct tcgtgtcacc atg gcc aca cac ccc gat ggc ttc 234
Met Ala Thr His Pro Asp Gly Phe
1 5

cgg ctt gag gga ccc ctg gct gca gcg cac agc cct ggg cct tgc act 282
Arg Leu Glu Gly Pro Leu Ala Ala Ala His Ser Pro Gly Pro Cys Thr
10 15 20

gtg ctc tac gaa ggc cct gtc cgt ggg ctc tgc ccc ttt gcc ccg cga 330
Val Leu Tyr Glu Gly Pro Val Arg Gly Leu Cys Pro Phe Ala Pro Arg
25 30 35 40

aat tcc aac acc atg tcg gcg gct gcc ctg gct gcc ccc agc ctg ggc 378
Asn Ser Asn Thr Met Ser Ala Ala Ala Leu Ala Ala Pro Ser Leu Gly
45 50 55

ttc gat ggg gtg att ggg gtg ctc gtg gct gat acc agc ctc acg gac 426
Phe Asp Gly Val Ile Gly Val Leu Val Ala Asp Thr Ser Leu Thr Asp
60 65 70

atg cac gtg gtg gat gta gag ctg agc gga ccc cgg ggc ccc acg tgc 474
Met His Val Val Asp Val Glu Leu Ser Gly Pro Arg Gly Pro Thr Cys
75 80 85

cga agc ttt gct gtg cac acc cgc aga gag aac cct gcc gag cca ggc 522
Arg Ser Phe Ala Val His Thr Arg Arg Glu Asn Pro Ala Glu Pro Gly
90 95 100

gcg gtc acc ggc tcc gcc acc gtc acg gcc ttc tgg cgg agc ctc ctg 570
Ala Val Thr Gly Ser Ala Thr Val Thr Ala Phe Trp Arg Ser Leu Leu
105 110 115 120

gcc tgc tgc cag ctc ccc tcc agg ccg ggg atc cat ctc tgc 612
Ala Cys Cys Gln Leu Pro Ser Arg Pro Gly Ile His Leu Cys
125 130

tgagaagcct cctccctccc gagacaagat catctgcctg gcctctcacc accaccatcc 672
cacccctgcc ctgccccact tccccagggt ctcccttctg actcagtaaa gatcaccgct 732
gcctcccccc gccaaaaaaa aaaaaaaaaa 762

<210> 235
<211> 537
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 5..259

<400> 235

aaaa atg cta aag gta gaa gca act ggt agt ccc gag gaa ggg tgg gcg 49
Met Leu Lys Val Glu Ala Thr Gly Ser Pro Glu Glu Gly Trp Ala
1 5 10 15

ggt gga gag ccc cgg act gga gct cct gcg aac tcc cct tcc tgc cct 97
Gly Gly Glu Pro Arg Thr Gly Ala Pro Ala Asn Ser Pro Ser Cys Pro
20 25 30

cag gag atg cca ctg cag gac cca agg agc agg gag gag gcg gcc agg 145
Gln Glu Met Pro Leu Gln Asp Pro Arg Ser Arg Glu Glu Ala Ala Arg
35 40 45

acc cag cag cta ttg ctg gcc act ctg cag gag gca gcg acc acg cag 193
Thr Gln Gln Leu Leu Leu Ala Thr Leu Gln Glu Ala Ala Thr Thr Gln
50 55 60

gag aac gtg gcc tgg agg aag aac tgg atg gtt ggc ggc gaa ggc ggc 241
Glu Asn Val Ala Trp Arg Lys Asn Trp Met Val Gly Gly Glu Gly Gly
65 70 75

gcc agc ggg agg tca ccg tgagaccgga cttgcctccg tgggcgccgg 289
Ala Ser Gly Arg Ser Pro
80 85

accttggctt gggcgcagga atccgaggca gcctttctcc ttcgtgggcc cagcggagag 349
tccggaccga gataccatgc caggactctc cggggtcctg tgagctgccg tcgggtgagc 409
acgtttcccc caaaccctgg actgactgct ttaaggtccg caaggcgggc cagggccgag 469
acgcgagtcg gatgtggtga actgaaagaa ccaataaaat catgttcctc caaaaaaaaa 529
aaaaaaaa 537

<210> 236 
<211> 994 
<212> DNA 
<213> Homo sapiens

<220>
<221> CDS
<222> 23..370

<400> 236

gattgctgtt tgctgtaaag tg atg ggg agg ccc tgg atg gtg atg ata ttg 52
Met Gly Arg Pro Trp Met Val Met Ile Leu
1 5 10

gag tca aaa tct gaa gaa aag atg tgg tat ggt gta ttc ctg tgg gca 100
Glu Ser Lys Ser Glu Glu Lys Met Trp Tyr Gly Val Phe Leu Trp Ala
15 20 25

ctg gtg tct tct ctc ttc ttt cat gtc cct gct gga tta ctg gcc ctc 148
Leu Val Ser Ser Leu Phe Phe His Val Pro Ala Gly Leu Leu Ala Leu
30 35 40

ttc acc ctc aga cat cac aaa tat ggt agg ttc atg tct gta agc atc 196
Phe Thr Leu Arg His His Lys Tyr Gly Arg Phe Met Ser Val Ser Ile
45 50 55

ctg ttg atg ggc atc gtg gga cca att act gct gga atc ttg aca agt 244
Leu Leu Met Gly Ile Val Gly Pro Ile Thr Ala Gly Ile Leu Thr Ser
60 65 70

gca gct att gct gga gtt tac cga gca gca ggg aag gaa atg ata cca 292
Ala Ala Ile Ala Gly Val Tyr Arg Ala Ala Gly Lys Glu Met Ile Pro
75 80 85 90

ttt gaa gcc ctc aca ctg ggc act gga cag aca ttt tgc gtc ttg gtg 340
Phe Glu Ala Leu Thr Leu Gly Thr Gly Gln Thr Phe Cys Val Leu Val
95 100 105

gtc tcc ttt tta cgg att tta gct act cta tagcatacat ccttatgctg 390
Val Ser Phe Leu Arg Ile Leu Ala Thr Leu
110 115

agatgttgaa cttaaacttt atggaatcct ccaaaagaat acattatgga gtgtagtgtt 450
ttcttagttc ttcaaaggga agcaacttgg atgaacagga acatgaagga caacacatct 510
cagccttttc ttcattttga agctcctaga attgaagact tatgtggact cctattgttc 570
tcaaccaaaa caagtctttt ggctttcttt tttgtagata tttaatttaa gcagttttca 630
tgtgtacctt tacccaagcc aagtcaacag tgtctctggg gtggcatcct ttgcactgaa 690
atttacagta ttctgtgaga tgtcgcatat tttgaagaaa ccgtggaaga tactggttta 750
tttcaaatga gcagagtatg ttgtattaaa atcttatcta atcttgatta aaatttggca 810
aactcttttc tttgctacat cttagtgaca ataaatgcca aataggtttt ggttgagtat 870
agttttgaaa acaaatttgg tgaaataaag caggaaaaaa aatttaagta taactcaagt 930
agtggctttg gttccactgt ttataaataa aaagtagata acaatggaaa aaaaaaaaaa 990
aaaa 994



<210> 237
<211> 662
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 41..352

<400> 237

tagctaaaaa ttgagggttc taaatactaa ggaagaaggg atg aat aga tat tgt 55
Met Asn Arg Tyr Cys
1 5

ggc aag ata ttt gtc tct gtc atg gtt aaa ttg caa aaa aat aaa ctt 103
Gly Lys Ile Phe Val Ser Val Met Val Lys Leu Gln Lys Asn Lys Leu
10 15 20

acc tcc ttc ccc agg cag cca ttg tta aca ttt ttt gaa tat cta gaa 151
Thr Ser Phe Pro Arg Gln Pro Leu Leu Thr Phe Phe Glu Tyr Leu Glu
25 30 35

aaa gtc ctt tgt tca gga tta ttt tcc cac tct gcc aag agt cac cat 199
Lys Val Leu Cys Ser Gly Leu Phe Ser His Ser Ala Lys Ser His His
40 45 50

gac ctg ctc aca cgc cac cct tat gaa act gcc gcg cca ctt ctc agc 247
Asp Leu Leu Thr Arg His Pro Tyr Glu Thr Ala Ala Pro Leu Leu Ser
55 60 65

tcc cat ttg att ctc aca gaa gct cta cga aat ggg ttg ggc aaa tgt 295
Ser His Leu Ile Leu Thr Glu Ala Leu Arg Asn Gly Leu Gly Lys Cys
70 75 80 85

cat gat cct cat ttc aca ggg gaa gaa act gag gcc cag agg ggg aaa 343
His Asp Pro His Phe Thr Gly Glu Glu Thr Glu Ala Gln Arg Gly Lys
90 95 100

ctg act acc taaaattgcc atgtaggccg gcgcggtggc tcacgcctgt 392
Leu Thr Thr

aatcccagca ctgtgggagg ccaaggcggg tggatcgcga ggtcaggaga tcgagaccat 452
cctggctggc acttgaagcc ccgtctctac tagggataca aataattggc cgggtgtggt 512
ggcgggcgcc tgtgkwccca gctgttcggc aggctgagga gggcgaatgg tgtgagcctg 572
ggaggcggag cttgcggtgg gccgggattg cgccactgca ctccagcctg ggcgacagag 632
ccagattccg tccaaaaaaa aaaaaaaaaa 662



<210> 238 

<211> 1829
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 3..1319

<400> 238
at cta ggt gac cat gga tgg gag ctg agc ttg gag gag gac gca cag 47
Leu Gly Asp His Gly Trp Glu Leu Ser Leu Glu Glu Asp Ala Gln
1 5 10 15

ctg tgg ggt ggg gtg gtg aag agt tgt ttt gag gga aaa ggc cca caa 95
Leu Trp Gly Gly Val Val Lys Ser Cys Phe Glu Gly Lys Gly Pro Gln
20 25 30

aga gaa gcc caa cca gcc agc ccc cag gcc gcc ccg cca gga ccc acc 143
Arg Glu Ala Gln Pro Ala Ser Pro Gln Ala Ala Pro Pro Gly Pro Thr
35 40 45

aat gag gca cag atg gca gcc gct gcc gcc cta gcc cgg ctg gag cag 191
Asn Glu Ala Gln Met Ala Ala Ala Ala Ala Leu Ala Arg Leu Glu Gln
50 55 60

aag cag tcc cgg gcc tgg ggc ccc aca tcg cag gac acc atc cga aac 239
Lys Gln Ser Arg Ala Trp Gly Pro Thr Ser Gln Asp Thr Ile Arg Asn
65 70 75

cag gtg aga aag gaa ctt caa gcc gaa gcc acc gtc agc ggg agc ccc 287
Gln Val Arg Lys Glu Leu Gln Ala Glu Ala Thr Val Ser Gly Ser Pro
80 85 90 95

gag gcc cca ggg acc aac gtg gta tct gag ccc aga gag gaa ggc tct 335
Glu Ala Pro Gly Thr Asn Val Val Ser Glu Pro Arg Glu Glu Gly Ser
100 105 110

gcc cac ctg gct gtg cct ggc gtg tac ttc acc tgt ccg ctc act ggg 383
Ala His Leu Ala Val Pro Gly Val Tyr Phe Thr Cys Pro Leu Thr Gly
115 120 125

gcc acc ctg agg aag gac cag cgg gac gcc tgc atc aag gag gcc att 431
Ala Thr Leu Arg Lys Asp Gln Arg Asp Ala Cys Ile Lys Glu Ala Ile
130 135 140

ctc ttg cac ttc tcc acc gac cca gtg gcc gcc tcc atc atg aag atc 479
Leu Leu His Phe Ser Thr Asp Pro Val Ala Ala Ser Ile Met Lys Ile
145 150 155

tac acg ttc aac aaa gac cag gac cgg gtg aag ctg ggt gtg gac acc 527
Tyr Thr Phe Asn Lys Asp Gln Asp Arg Val Lys Leu Gly Val Asp Thr
160 165 170 175

att gcc aag tac ctg gac aac atc cac ctg cac ccc gag gag gag aag 575
Ile Ala Lys Tyr Leu Asp Asn Ile His Leu His Pro Glu Glu Glu Lys
180 185 190

tac cgg aag atc aag ctg cag aac aag gtg ttt cag gag cgc att aac 623
Tyr Arg Lys Ile Lys Leu Gln Asn Lys Val Phe Gln Glu Arg Ile Asn
195 200 205

tgc ctg gaa ggg acc cac gag ttt ttt gag gcc att ggg ttc cag aag 671
Cys Leu Glu Gly Thr His Glu Phe Phe Glu Ala Ile Gly Phe Gln Lys
210 215 220

gtg ttg ctt ccc gcc cag gat cag gag gac ccc gag gag ttc tac gtg 719
Val Leu Leu Pro Ala Gln Asp Gln Glu Asp Pro Glu Glu Phe Tyr Val
225 230 235

ctg agc gag acc acc ttg gcc cag ccc cag agc ctg gag agg cac aag 767
Leu Ser Glu Thr Thr Leu Ala Gln Pro Gln Ser Leu Glu Arg His Lys
240 245 250 255

gaa cag ctg ctg gct gcg gag ccc gtg cgc gcc aag ctg gac agg cag 815
Glu Gln Leu Leu Ala Ala Glu Pro Val Arg Ala Lys Leu Asp Arg Gln
260 265 270

cgc cgc gtc ttc cag ccc tcg ccc ctg gcc tcg cag ttc gaa ctg cct 863
Arg Arg Val Phe Gln Pro Ser Pro Leu Ala Ser Gln Phe Glu Leu Pro
275 280 285

ggg gac ttc ttc aac ctc aca gca gag gag atc aag cgg gag cag agg 911
Gly Asp Phe Phe Asn Leu Thr Ala Glu Glu Ile Lys Arg Glu Gln Arg
290 295 300

ctc agg tcc gag gcg gtg gag cgg ctg agc gtg ctg cgg acc aag gcc 959
Leu Arg Ser Glu Ala Val Glu Arg Leu Ser Val Leu Arg Thr Lys Ala




305 










310 










315 














atg 


egg 


gag 


aag 


gag 


gag 


cag 


egg 


ggg 


ctg 


cgc 


aag 


tac 


aac 


tac 


acg 


1007 




Met 


Arg 


Glu 


Lys 


Glu 


Glu 


Gin 


Arg 


Gly 


Leu 


Arg 


Lys 


Tyr 


Asn 


Tyr 


Thr 






320 










325 










330 










335 





232 
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10 



15 



20 



25 



30 



35 



ctg ctg cgc gtg cgc ctc ccc gat ggc tgc ctc ctg cag ggc act ttc 1055
Leu Leu Arg Val Arg Leu Pro Asp Gly Cys Leu Leu Gln Gly Thr Phe
340 345 350

tac gct cgg gag cgg ctg ggg gcg gtg tac ggg ttc gtc cgg gag gcc 1103
Tyr Ala Arg Glu Arg Leu Gly Ala Val Tyr Gly Phe Val Arg Glu Ala
355 360 365

ctg cag agc gac tgg ctg cct ttt gag ctg ctg gcc tcg gga ggg cag 1151
Leu Gln Ser Asp Trp Leu Pro Phe Glu Leu Leu Ala Ser Gly Gly Gln
370 375 380

aag ctg tcc gag gac gag aac ctg gcc ttg aac gag tgc ggg ctg gtg 1199
Lys Leu Ser Glu Asp Glu Asn Leu Ala Leu Asn Glu Cys Gly Leu Val
385 390 395

ccc tct gcc ctc ctg acc ttc tcg tgg gac atg gct gtg ctg gag gac 1247
Pro Ser Ala Leu Leu Thr Phe Ser Trp Asp Met Ala Val Leu Glu Asp
400 405 410 415

atc aag gcc gcg ggg gcc gag ccg gac tcc atc ctg aaa ccc gag ctc 1295
Ile Lys Ala Ala Gly Ala Glu Pro Asp Ser Ile Leu Lys Pro Glu Leu
420 425 430

ctg tca gcc atc gag aag ctc ttg tgaaataaaa gcagggttgg cctcagccct 1349
Leu Ser Ala Ile Glu Lys Leu Leu
435

gtgggtctgt ctcatgctct ccctgttcct ctccccgcca ccccagggcc tccaagccac 1409
ctctggaaat acttggctct gccccatggg cacgggaggg gcgccagccg tggagctgtg 1469
gaattgggcc ccgtggcaga gcccccatcc cttgggggct gtggggatgc gcccaagccc 1529
ccgagggaga ggcctgggga caccaacaaa tctaagccct ccctagctct tggtaactgt 1589
gtcatgaagc tgccggacag acacacgtgg catctccctg ggcaggagag caggcctgca 1649
gcatgggtcc cgttcccgtg tgccgtgggt ggcagtggct gcacctggca ctagggctgc 1709
tctgtggatg tgggtgacaa cggcaggagg ggacgctggc cttcctgcac atagacctgc 1769
agttagtaaa tcataagccc aaataaacag gttgtttgaa tataaaaaaa aaaaaaaaaa 1829

<210> 239
<211> 1083
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 421..768

<400> 239
aaggatgtgc tctttcccaa ggagagggag ctctgttgcc tccttcccac agaatcactc 60
tgtgcaaacc tcttcccctc cttggcccca gtctccccaa ttctaaaatc ggatactgga 120
taaaatgcca cggaagaacc tagggatgca ccaggaacca cgcgcctgaa tgccacaggt 180
ttgatttgat tcatgaccct catctggaca caagctctaa aatacttgag ccttggcaga 240
aatggctgat agagtccaca gaacacgctg tcctcatctc agagaggaga actctgaacc 300
cagaggggaa ggatttacct gcagttgtat ggcaagccag aggtaggcgc tgcactggaa 360
cgcagcctaa ccagcctaaa gaaaccatgg gaggagaggc tcttaccctc tcctttgcag 420
atg tgg gcc cgg ctg cct cac act cca gag cag atg ggc cac agg ctt 468
Met Trp Ala Arg Leu Pro His Thr Pro Glu Gln Met Gly His Arg Leu
1 5 10 15

ata ggt ccc aag gaa gct tca ctt cat gtg gta ccc agc tgg cca gcc 516
Ile Gly Pro Lys Glu Ala Ser Leu His Val Val Pro Ser Trp Pro Ala
20 25 30

agg aag atg gag ggg ctt ctg gct ggc ctc tct tcc tct cct aga aag 564
Arg Lys Met Glu Gly Leu Leu Ala Gly Leu Ser Ser Ser Pro Arg Lys
35 40 45

tca tgc tgg ccc ttt tgg gtc cat ggg cca aag gtt cat gaa ggt ggc 612
Ser Cys Trp Pro Phe Trp Val His Gly Pro Lys Val His Glu Gly Gly
50 55 60

tct gcc tgt gag aca tca agc tcc tgg gtt gaa gga ctt gga tta aga 660
Ser Ala Cys Glu Thr Ser Ser Ser Trp Val Glu Gly Leu Gly Leu Arg
65 70 75 80

aga gtg aca tca gtg cac agt tta tgc caa ggg ctt ggg gcc tca gtc 708
Arg Val Thr Ser Val His Ser Leu Cys Gln Gly Leu Gly Ala Ser Val
85 90 95

cag ctt ctt cct gga cca cca cca aca aca acc agt gat aaa aat aat 756
Gln Leu Leu Pro Gly Pro Pro Pro Thr Thr Thr Ser Asp Lys Asn Asn
100 105 110

tat act agt ggc tgacatttat ggattcttcc tacacactag gctataccac 808
Tyr Thr Ser Gly
115

agcgagtgcc tcgaaaggaa atatagtata gcactgtgcc gtccaacatg gcggccacta 868
gccacatgca ctactgagca cttgaaatgt ggctagccca cattgagatg tgctgtaaat 928
aaagaataga caccagattt ccaagactta gtaccaaaaa aagaatgtaa aatttctcat 988
taacaatttt ttttcttaca tttattacat gttaacatga cgctatttgg agtttaaata 1048
aatgcattat taaaattcaa aaaaaaaaaa aaaaa 1083

<210> 240
<211> 1831
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 78..590

<400> 240
aaggacttaa gcgccccgga gccgggaggc gaacttggga cccgctggcc tcgctcggtg 60
cgcgcctccc tccccgc atg cag ccc gcc gag cgc tcg cgg gtc ccc agg 110
Met Gln Pro Ala Glu Arg Ser Arg Val Pro Arg
1 5 10

atc gac ccg tac gga ttc gag cgg cct gag gac ttc gac gac gcc gcc 158
Ile Asp Pro Tyr Gly Phe Glu Arg Pro Glu Asp Phe Asp Asp Ala Ala
15 20 25

tac gag aag ttt ttc tcc agc tac ctg gtc acg ctc acc cgc agg gcg 206
Tyr Glu Lys Phe Phe Ser Ser Tyr Leu Val Thr Leu Thr Arg Arg Ala
30 35 40

atc aaa tgg tcc cgg ctg ctg cag ggc ggg ggc gtc ccc agg agc cgg 254
Ile Lys Trp Ser Arg Leu Leu Gln Gly Gly Gly Val Pro Arg Ser Arg
45 50 55

aca gtg aag cgc tat gtc cgg aaa ggg gtc ccg ctg gag cac cgt gcc 302
Thr Val Lys Arg Tyr Val Arg Lys Gly Val Pro Leu Glu His Arg Ala
60 65 70 75

cgc gtc tgg atg gtg ctg agt ggg gcc cag gcg cag atg gac cag aat 350
Arg Val Trp Met Val Leu Ser Gly Ala Gln Ala Gln Met Asp Gln Asn
80 85 90

ccc ggc tac tac cac cag ctt ctc cag gga gag aga aac ccc agg ctg 398
Pro Gly Tyr Tyr His Gln Leu Leu Gln Gly Glu Arg Asn Pro Arg Leu
95 100 105

gag gac gcc atc agg aca gac ctg aac cgg acc ttc ccc gac aac gtg 446
Glu Asp Ala Ile Arg Thr Asp Leu Asn Arg Thr Phe Pro Asp Asn Val
110 115 120

aag ttc cgg aag acc acg gac ccc tgc tta cag agg acc ctg tac aat 494
Lys Phe Arg Lys Thr Thr Asp Pro Cys Leu Gln Arg Thr Leu Tyr Asn
125 130 135

gtg ctg ctg gca tat ggg cac cat aac cag gga gtg ggc tac tgc cag 542
Val Leu Leu Ala Tyr Gly His His Asn Gln Gly Val Gly Tyr Cys Gln
140 145 150 155

gga atg aat ttt ata gca gga tat ctg att ctt ata aca aat aat gaa 590
Gly Met Asn Phe Ile Ala Gly Tyr Leu Ile Leu Ile Thr Asn Asn Glu
160 165 170

taagaatctt tttggctgtt agatgctctt gttggaagaa tactaccaga ttactacagc 650
ccggccatgc tgggcctgaa gaccgaccag gaggtcctcg gggagctggt gcgggcgaag 710
ctgccggctg tgggggccct gatggagcgt ctcggtgtgc tgtggacgct gctggtgtcc 770
cgctggttca tctgcctgtt tgtggacatc ttgcccgtgg agacagtgct tcggatctgg 830
gactgtttgt ttaacgaagg ctcgaagatt atcttccggt tggccctgac cttaattaag 890

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cagcaccagg agttgatttt ggaagccacc agcgttccag acatttgcga taagtttaag 950 

cagataacca aagggagttt cgtgatggag tgtcacacgt ttatgcaggt gtgtggggct 1010 

gcacgtggct cagtcccctc ccagggggcc ccgcctcacc tgcagcccgg gggctgctct 1070 

gaccacccgg aggatgcaca ggatgggcac cagtgggcat agggcacagg atgagcctcc 1130 

agctctgtcc tgcatctgcc ccctgcgcct ggcctccgag ggctttcctg tctatggcgg 1190

ccctgtcttc ttggccctgg cactgcggac gctgctcctg gtcctaatgg ctgtactcat 1250

ctgctgtgtg tggtgccaga agtgtggctt cccgaggccc ggcctcccca ctgggtcctg 1310

gacctggcgc aggccgtata gactcaggtc ctgatgaggg cgttgtggga gctgtacctg 1370

acaggccttc tgaggaagcc aagacgccag gagaggctca ggcctgggag tcagtagttt 1430

cctaagaggg agtggaggct cggggccact ctgggtgcag catggcaaac gtgggcggta 1490

tttcagcagc tgggccttca tcaaagagaa gaccatgttg gccgggcgcg gtggctcacg 1550

cctgcagtcc cagcactttg ggaggccaag gcgtgtggat cacctgaggt caggagttca 1610

agaccagcct ggccaacacg gtgaaacccc gtctctacta aaaaatacaa aaattagcca 1670

ggtgtggtgg ctcacgctta tgtagtccca gttactcggg aggctgaggc acgagaatca 1730

cttgaacctg ggaggcggag gttgcagtga gccgagatcg cgccactgca ctccagcctg 1790

ggcaacagag tgagactctg tctcaaaaaa aaaaaaaaaa a 1831

<210> 241
<211> 1830
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> 78..608

<400> 241
aaggacttaa gcgccccgga gccgggaggc gaacttggga cccgctggcc tcgctcggtg 60
cgcgcctccc tccccgc atg cag ccc gcc gag cgc tcg cgg gtc ccc agg 110
Met Gln Pro Ala Glu Arg Ser Arg Val Pro Arg
1 5 10

atc gac ccg tac gga ttc gag cgg cct gag gac ttc gac gac gcc gcc 158
Ile Asp Pro Tyr Gly Phe Glu Arg Pro Glu Asp Phe Asp Asp Ala Ala
15 20 25

tac gag aag ttt ttc tcc agc tac ctg gtc acg ctc acc cgc agg gcg 206
Tyr Glu Lys Phe Phe Ser Ser Tyr Leu Val Thr Leu Thr Arg Arg Ala
30 35 40

atc aaa tgg tcc cgg ctg ctg cag ggc ggg ggc gtc ccc agg agc cgg 254
Ile Lys Trp Ser Arg Leu Leu Gln Gly Gly Gly Val Pro Arg Ser Arg
45 50 55

aca gtg aag cgc tat gtc cgg aaa ggg gtc ccg ctg gag cac cgt gcc 302
Thr Val Lys Arg Tyr Val Arg Lys Gly Val Pro Leu Glu His Arg Ala
60 65 70 75

cgc gtc tgg atg gtg ctg agt ggg gcc cag gcg cag atg gac cag aat 350
Arg Val Trp Met Val Leu Ser Gly Ala Gln Ala Gln Met Asp Gln Asn
80 85 90

ccc ggc tac tac cac cag ctt ctc cag gga gag aga aac ccc agg ctg 398
Pro Gly Tyr Tyr His Gln Leu Leu Gln Gly Glu Arg Asn Pro Arg Leu
95 100 105

gag gac gcc atc agg aca gac ctg aac cgg acc ttc ccc gac aac gtg 446
Glu Asp Ala Ile Arg Thr Asp Leu Asn Arg Thr Phe Pro Asp Asn Val
110 115 120

aag ttc cgg aag acc acg gac ccc tgc tta cag agg acc ctg tac aat 494
Lys Phe Arg Lys Thr Thr Asp Pro Cys Leu Gln Arg Thr Leu Tyr Asn
125 130 135

gtg ctg ctg gca tat ggg cac cat aac cag gga gtg ggc tac tgc cag 542
Val Leu Leu Ala Tyr Gly His His Asn Gln Gly Val Gly Tyr Cys Gln
140 145 150 155

gga atg aat ttt ata gca gga tat ctg att ctt ata aca aat aat gat 590
Gly Met Asn Phe Ile Ala Gly Tyr Leu Ile Leu Ile Thr Asn Asn Asp
160 165 170

aag aat ctt ttt ggc tgt tagatgctct tgttggaaga atactaccag 638
Lys Asn Leu Phe Gly Cys
175

attactacag cccggccatg ctgggcctga agaccgacca ggaggtcctc ggggagctgg 698
tgcgggcgaa gctgccggct gtgggggccc tgatggagcg tctcggtgtg ctgtggacgc 758
tgctggtgtc ccgctggttc atctgcctgt ttgtggacat cttgcccgtg gagacagtgc 818
ttcggatctg ggactgtttg tttaacgaag gctcgaagat tatcttccgg ttggccctga 878
ccttaattaa gcagcaccag gagttgattt tggaagccac cagcgttcca gacatttgcg 938
ataagtttaa gcagataacc aaagggagtt tcgtgatgga gtgtcacacg tttatgcagg 998
tgtgtggggc tgcacgtggc tcagtcccct cccagggggc cccgcctcac ctgcagcccg 1058
ggggctgctc tgaccacccg gaggatgcac aggatgggca ccagtgggca tagggcacag 1118
gatgagcctc cagctctgtc ctgcatctgc cccctgcgcc tggcctccga gggctttcct 1178
gtctatggcg gccctgtctt cttggccctg gcactgcgga cgctgctcct ggtcctaatg 1238
gctgtactca tctgctgtgt gtggtgccag aagtgtggct tcccgaggcc cggcctcccc 1298
actgggtcct ggacctggcg caggccgtat agactcaggt cctgatgagg gcgttgtggg 1358
agctgtacct gacaggcctt ctgaggaagc caagacgcca ggagaggctc aggcctggga 1418
gtcagtagtt tcctaagagg gagtggaggc tcggggccac tctgggtgca gcatggcaaa 1478
cgtgggcggt atttcagcag ctgggccttc atcaaagaga agaccatgtt ggccgggcgc 1538
ggtggctcac gcctgcagtc ccagcacttt gggaggccaa ggcgtgtgga tcacctgagg 1598
tcaggagttc aagaccagcc tggccaacac ggtgaaaccc cgtctctact aaaaaataca 1658
aaaattagcc aggtgtggtg gctcacgctt atgtagtccc agttactcgg gaggctgagg 1718
cacgagaatc acttgaacct gggaggcgga ggttgcagtg agccgagatc gcgccactgc 1778
actccagcct gggcaacaga gtgagactct gtctcaaaaa aaaaaaaaaa aa 1830



<210> 242 

<211> 508 

<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -27..-1

<400> 242
Met Asp Pro Lys Leu Gly Arg Met Ala Ala Ser Leu Leu Ala Val Leu
-25 -20 -15

Leu Leu Leu Leu Leu Glu Arg Gly Met Phe Ser Ser Pro Ser Pro Pro
-10 -5 1 5

Pro Ala Leu Leu Glu Lys Val Phe Gln Tyr Ile Asp Leu His Gln Asp
10 15 20

Glu Phe Val Gln Thr Leu Lys Glu Trp Val Ala Ile Glu Ser Asp Ser
25 30 35

Val Gln Pro Val Pro Arg Phe Arg Gln Glu Leu Phe Arg Met Met Ala
40 45 50

Val Ala Ala Asp Thr Leu Gln Arg Leu Gly Ala Arg Val Ala Ser Val
55 60 65

Asp Met Gly Pro Gln Gln Leu Pro Asp Gly Gln Ser Leu Pro Ile Pro
70 75 80 85

Pro Val Ile Leu Ala Glu Leu Gly Ser Asp Pro Thr Lys Gly Thr Val
90 95 100

Cys Phe Tyr Gly His Leu Asp Val Gln Pro Ala Asp Arg Gly Asp Gly
105 110 115

Trp Leu Thr Asp Pro Tyr Val Leu Thr Glu Val Asp Gly Lys Leu Tyr
120 125 130

Gly Arg Gly Ala Thr Asp Asn Lys Gly Pro Val Leu Ala Trp Ile Asn
135 140 145

Ala Val Ser Ala Phe Arg Ala Leu Glu Gln Asp Leu Pro Val Asn Ile
150 155 160 165

Lys Phe Ile Ile Glu Gly Met Glu Glu Ala Gly Ser Val Ala Leu Glu
170 175 180

Glu Leu Val Glu Lys Glu Lys Asp Arg Phe Phe Ser Gly Val Asp Tyr
185 190 195

Ile Val Ile Ser Asp Asn Leu Trp Ile Ser Gln Arg Lys Pro Ala Ile
200 205 210

Thr Tyr Gly Thr Arg Gly Asn Ser Tyr Phe Met Val Glu Val Lys Cys
215 220 225

Arg Asp Gln Asp Phe His Ser Gly Thr Phe Gly Gly Ile Leu His Glu
230 235 240 245

Pro Met Ala Asp Leu Val Ala Leu Leu Gly Ser Leu Val Asp Ser Ser
250 255 260

Gly His Ile Leu Val Pro Gly Ile Tyr Asp Glu Val Val Pro Leu Thr
265 270 275

Glu Glu Glu Ile Asn Thr Tyr Lys Ala Ile His Leu Asp Leu Glu Glu
280 285 290

Tyr Arg Asn Ser Ser Arg Val Glu Lys Phe Leu Phe Asp Thr Lys Glu
295 300 305

Glu Ile Leu Met His Leu Trp Arg Tyr Pro Ser Leu Ser Ile His Gly
310 315 320 325

Ile Glu Gly Ala Phe Asp Glu Pro Gly Thr Lys Thr Val Ile Pro Gly
330 335 340

Arg Val Ile Gly Lys Phe Ser Ile Arg Leu Val Pro His Met Asn Val
345 350

Ser Ala Val Glu Lys Gln Val Thr Arg His Leu Glu Asp Val Phe Ser
360 365 370

Lys Arg Asn Ser Ser Asn Lys Met Val Val Ser Met Thr Leu Gly Leu
375 380
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Pro 


Trp 


He 


Ala 
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He 


Asp 


Asp 


Thr 


Gin 


Tyr 


Leu 


Ala 


Ala 


Lys 


390 
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He 


Arg 


Thr 


Val 
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Gly 


Thr 


Glu 


Pro 


Asp 


Met 


He 


Arg 


Asp 
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Gly 


Ser 


Thr 


He 


Pro 


He 


Ala 


Lys 


Met 


Phe 


Gin 


Glu 


He 


Val 


His 


Lys 
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Val 


Val 
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He 


Pro 


Leu 


Gly 
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Val 


Asp 


Asp 


Gly 


Glu 


His 


Ser 
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Arg 
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He 


Glu 


Gly 


Thr 


Lys 


Leu 




455 










460 










465 










Phe 


Ala 


Ala 
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Glu 


Met 
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His 
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<221> SIGNAL 
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<400> 243 




























Met 
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Gly 
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Val 
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Leu 


Leu 


Val 


Thr 


Arg 


Ser 


Pro 


Val 
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Ala 


Cys 
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1 


Leu Leu Thr Gly Ser Leu Phe Val Leu Leu Arg Val Phe Ser Phe Glu
5 10 15

Pro Val Pro Ser Cys Arg Ala Leu Gln Val Leu Lys Pro Arg Asp Arg
20 25 30

Ile Ser Ala Ile Ala His Arg Gly Gly Ser His Asp Ala Pro Glu Asn
35 40 45

Thr Leu Ala Ala Ile Arg Gln Ala Ala Lys Asn Gly Ala Thr Gly Val
50 55 60

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<210> 258 

<211> 200
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -20..-1

<400> 258
Met Asp Ser Ser Thr Ala His Ser Pro Val Phe Leu Val Phe Pro Pro
-20 -15 -10 -5

Glu Ile Thr Ala Ser Glu Tyr Glu Ser Thr Glu Leu Ser Ala Thr Thr
1 5 10

Phe Ser Thr Gln Ser Pro Leu Gln Lys Leu Phe Ala Arg Lys Met Lys
15 20 25

Ile Leu Gly Thr Ile Gln Ile Leu Phe Gly Ile Met Thr Phe Ser Phe
30 35 40

Gly Val Ile Phe Leu Phe Thr Leu Leu Lys Pro Tyr Pro Arg Phe Pro
45 50 55 60

Phe Ile Phe Leu Ser Gly Tyr Pro Phe Trp Gly Ser Val Leu Phe Ile
65 70 75

Asn Ser Gly Ala Phe Leu Ile Ala Val Lys Arg Lys Thr Thr Glu Thr
80 85 90

Leu Ile Ile Leu Ser Arg Ile Met Asn Phe Leu Ser Ala Leu Gly Ala
95 100 105

Ile Ala Gly Ile Ile Leu Leu Thr Phe Gly Phe Ile Leu Asp Gln Asn
110 115 120

Tyr Ile Cys Gly Tyr Ser His Gln Asn Ser Gln Cys Lys Ala Val Thr
125 130 135 140
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Val Leu His His Phe Asp Ser Ser Ser Gin Glu Ser Val Pro Lys Arg 
330 335 340 345 

Arg Lys Phe Ser Glu Pro Lys Glu His Ile
350 355

<210> 260
<211> 158
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -17..-1

<400> 260
Met Ala Leu Glu Val Leu Met Leu Leu Ala Val Leu Ile Trp Thr Gly
-15 -10 -5

Ala Glu Asn Leu His Val Lys Ile Ser Cys Ser Leu Asp Trp Leu Met
1 5 10 15

Val Ser Val Ile Pro Val Ala Glu Ser Arg Asn Leu Tyr Ile Phe Ala
20 25 30

Asp Glu Leu His Leu Gly Met Gly Cys Pro Ala Asn Arg Ile His Thr
35 40 45

Tyr Val Tyr Glu Phe Ile Tyr Leu Val Arg Asp Cys Gly Ile Arg Thr
50 55 60

Arg Val Val Ser Glu Glu Thr Leu Leu Phe Gln Thr Glu Leu Tyr Phe
65 70 75

Thr Pro Arg Asn Ile Asp His Asp Pro Gln Glu Ile His Leu Glu Cys
80 85 90 95

Ser Thr Ser Arg Lys Ser Val Trp Leu Thr Pro Val Ser Thr Glu Asn
100 105 110

Glu Ile Lys Leu Asp Pro Ser Pro Phe Ile Ala Asp Phe Gln Thr Thr
115 120 125

Ala Glu Glu Leu Gly Leu Leu Ser Ser Ser Pro Asn Leu Leu
130 135 140

<210> 261
<211> 233
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -32..-1

<400> 261
Met Ala Thr Pro Pro Phe Arg Leu Ile Arg Lys Met Phe Ser Phe Lys
-30 -25 -20

Val Ser Arg Trp Met Gly Leu Ala Cys Phe Arg Ser Leu Ala Ala Ser
-15 -10 -5

Ser Pro Ser Ile Arg Gln Lys Lys Leu Met His Lys Leu Gln Glu Glu
1 5 10 15

Lys Ala Phe Arg Glu Glu Met Lys Ile Phe Arg Glu Lys Ile Glu Asp
20 25 30

Phe Arg Glu Glu Met Trp Thr Phe Arg Gly Lys Ile His Ala Phe Arg
35 40 45

Gly Gln Ile Leu Gly Phe Trp Glu Glu Glu Arg Pro Phe Trp Glu Glu
50 55 60

Glu Lys Thr Phe Trp Lys Glu Glu Lys Ser Phe Trp Glu Met Glu Lys
65 70 75 80

Ser Phe Arg Glu Glu Glu Lys Thr Phe Trp Lys Lys Tyr Arg Thr Phe
85 90 95

Trp Lys Glu Asp Lys Ala Phe Trp Lys Glu Asp Asn Ala Leu Trp Glu
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<220> 

<221> SIGNAL
<222> -19..-1

<400> 264
Met Phe Leu Thr Val Lys Leu Leu Leu Gly Gln Arg Cys Ser Leu Lys
-15 -10 -5

Val Ser Gly Gln Glu Ser Val Ala Thr Leu Lys Arg Leu Val Ser Arg
1 5 10

Arg Leu Lys Val Pro Glu Glu Gln Gln His Leu Leu Phe Arg Gly Gln
15 20 25

Leu Leu Glu Asp Asp Lys His Leu Ser Asp Tyr Cys Ile Gly Pro Asn
30 35 40 45

Ala Ser Ile Asn Val Ile Met Gln Pro Leu Glu Lys Met Ala Leu Lys
50 55 60

Glu Ala His Gln Pro Gln Thr Gln Pro Leu Trp His Gln Leu Gly Leu
65 70 75

Val Leu Ala Lys His Phe Glu Pro Gln Asp Ala Lys Ala Val Leu Gln
80 85 90

Leu Leu Arg Gln Glu His Glu Glu Arg Leu Gln Lys Ile Ser Leu Glu
95 100 105

His Leu Glu Gln Leu Ala Gln Tyr Leu Leu Ala Glu Glu Pro His Val
110 115 120 125

Glu Pro Ala Gly Glu Arg Glu Leu Glu Ala Lys Ala Arg Pro Gln Ser
130 135 140

Ser Cys Asp Met Glu Glu Lys Glu Glu Ala Ala Ala Asp Gln
145 150 155

<210> 265
<211> 106
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -17..-1

<400> 265
Met Ala Leu Glu Val Leu Met Leu Leu Ala Val Leu Ile Trp Thr Gly
-15 -10 -5

Ala Glu Asn Leu His Val Lys Ile Ser Cys Ser Leu Asp Trp Leu Met
1 5 10 15

Val Ser Val Ile Pro Val Ala Glu Ser Arg Asn Leu Tyr Ile Phe Ala
20 25 30

Asp Glu Leu His Leu Gly Met Gly Cys Pro Ala Asn Arg Ile His Thr
35 40 45

Tyr Val Tyr Glu Phe Ile Tyr Leu Val Arg Asp Cys Gly Ile Arg Thr
50 55 60

Arg Val Arg Thr Val Ile Val Cys Lys Lys Tyr Cys Met Phe Cys Gln
65 70 75

Thr Phe Met Pro Ser Ile Lys Ile Val Phe
80 85

<210> 266
<211> 124
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -18..-1

<400> 266
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<211> 261 

<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -16..-1

<400> 267
Met Glu Asn Phe Ser Leu Leu Ser Ile Ser Gly Pro Pro Ile Ser Ser
-15 -10 -5

Ser Ala Leu Ser Ala Phe Pro Asp Ile Met Phe Ser Arg Ala Thr Ser
1 5 10 15

Leu Pro Asp Ile Ala Lys Thr Ala Val Pro Thr Glu Ala Ser Ser Pro
20 25 30

Ala Gln Ala Leu Pro Pro Gln Tyr Gln Ser Ile Ile Val Arg Gln Gly
35 40 45

Ile Gln Asn Thr Val Leu Ser Pro Asp Cys Ser Leu Gly Asp Thr Gln
50 55 60

His Gly Glu Lys Leu Arg Arg Asn Cys Thr Ile Tyr Arg Pro Trp Phe
65 70 75 80

Ser Pro Tyr Ser Tyr Phe Val Cys Ala Asp Lys Glu Ser Gln Leu Glu
85 90 95

Ala Tyr Asp Phe Pro Glu Val Gln Gln Asp Glu Gly Lys Trp Asp Asn
100 105 110

Cys Leu Ser Glu Asp Met Ala Glu Asn Ile Cys Ser Ser Ser Ser Ser
115 120 125

Pro Glu Asn Thr Cys Pro Arg Glu Ala Thr Lys Lys Ser Arg His Gly
130 135 140

Leu Asp Ser Ile Thr Ser Gln Asp Ile Leu Met Ala Ser Arg Trp His
145 150 155 160

Pro Ala Gln Gln Asn Gly Tyr Lys Cys Val Ala Cys Cys Arg Met Tyr
165 170 175

Pro Thr Leu Asp Phe Leu Lys Ser His Ile Lys Arg Gly Phe Arg Glu
180 185 190

Gly Phe Ser Cys Lys Val Tyr Tyr Arg Lys Leu Lys Ala Leu Trp Ser
195 200 205

Lys Glu Gln Lys Ala Arg Leu Gly Asp Arg Leu Ser Ser Gly Ser Cys
210 215 220

Gln Ala Phe Asn Ser Pro Ala Glu His Leu Arg Gln Ile Gly Gly Glu
225 230 235 240

Ala Tyr Leu Cys Leu
245

<210> 268
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Ser Ala Leu 

Pro Ala Val 
10 

Arg Ser Ile
25 

Val Arg Ser 

Pro Ser Leu 

Ser Asp Asp 
75 

Glu Asn Asn 
90 

Lys Ser Leu 
105 

Pro Lys Asp 

Arg Gly Asn 

Trp Leu Gly 
155 

Pro Pro Glu 

170 
Phe Asp Cys 
185 

Gln Ser Leu
Val Ile Ala



Ile Pro Leu
-20 

Leu Leu Thr 
-5 

Cys Thr Cys 

Pro Arg Thr 
30 

Val Phe Thr 
45 

Gln Leu Leu
60

Ala Phe Ile

Asn Ile Lys

Ile His Leu
110

Ile Phe Lys
125 

Ser Phe Asn 
140 

His Thr Asn 

Tyr Lys Lys 

Ile Ile Thr
190

Ser Ile Asp

205
Gln Pro Phe
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210 215 220 

Thr Gly Lys Cys Ile Phe Leu Glu Trp Asp His Val Glu Lys Thr Phe
225 230 235

Arg Asn Tyr Asp Asn Ile Thr Val Leu Arg Glu Ile His Arg Phe Thr
240 245 250

Asn Met Ser
255

<210> 310
<211> 426
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -28..-1

<400> 310
Met Ser Pro Ala Phe Arg Ala Met Asp Val Glu Pro Arg Ala Lys Gly
-25 -20 -15

Val Leu Leu Glu Pro Phe Val His Gln Val Gly Gly His Ser Cys Val
-10 -5 1

Leu Arg Phe Asn Glu Thr Thr Leu Cys Lys Pro Leu Val Pro Arg Glu
5 10 15 20

His Gln Phe Tyr Glu Thr Leu Pro Ala Glu Met Arg Lys Phe Thr Pro
25 30 35

Gln Tyr Lys Gly Val Val Ser Val Arg Phe Glu Glu Asp Glu Asp Arg
40 45 50

Asn Leu Cys Leu Ile Ala Tyr Pro Leu Lys Gly Asp His Gly Ile Val
55 60 65

Asp Ile Val Asp Asn Ser Asp Cys Glu Pro Lys Ser Lys Leu Leu Arg
70 75 80

Trp Thr Thr Asn Lys Lys His His Val Leu Glu Thr Glu Lys Thr Pro
85 90 95 100

Lys Asp Trp Val Arg Gln His Arg Lys Glu Glu Lys Met Lys Ser His
105 110 115

Lys Leu Glu Glu Glu Phe Glu Trp Leu Lys Lys Ser Glu Val Leu Tyr
120 125 130

Tyr Thr Val Glu Lys Lys Gly Asn Ile Ser Ser Gln Leu Lys His Tyr
135 140 145

Asn Pro Trp Ser Met Lys Cys His Gln Gln Gln Leu Gln Arg Met Lys
150 155 160

Glu Asn Ala Lys His Arg Asn Gln Tyr Lys Phe Ile Leu Leu Glu Asn
165 170 175 180

Leu Thr Ser Arg Tyr Glu Val Pro Cys Val Leu Asp Leu Lys Met Gly
185 190 195

Thr Arg Gln His Gly Asp Asp Ala Ser Glu Glu Lys Ala Ala Asn Gln
200 205 210

Ile Arg Lys Cys Gln Gln Ser Thr Ser Ala Val Ile Gly Val Arg Val
215 220 225

Cys Gly Met Gln Val Tyr Gln Ala Gly Ser Gly Gln Leu Met Phe Met
230 235 240

Asn Lys Tyr His Gly Arg Lys Leu Ser Met Gln Gly Phe Lys Glu Ala
245 250 255 260

Leu Phe Gln Phe Phe His Asn Gly Arg Tyr Leu Arg Arg Glu Leu Leu
265 270 275

Gly Pro Val Leu Lys Lys Leu Thr Glu Leu Lys Ala Val Leu Glu Arg
280 285 290

Gln Glu Ser Tyr Arg Phe Tyr Ser Ser Ser Leu Leu Val Ile Tyr Asp
295 300 305

Gly Lys Glu Arg Pro Glu Val Val Leu Asp Ser Asp Ala Glu Asp Leu
310 315 320

Glu Asp Leu Ser Glu Glu Ser Ala Asp Glu Ser Ala Gly Ala Tyr Ala
325 330 335 340

Tyr Lys Pro Ile Gly Ala Ser Ser Val Asp Val Arg Met Ile Asp Phe
345 350 355

Ala His Thr Thr Cys Arg Leu Tyr Gly Glu Asp Thr Val Val His Glu
360 365 370

Gly Gln Asp Ala Gly Tyr Ile Phe Gly Leu Gln Ser Leu Ile Asp Ile
375 380 385

Val Thr Glu Ile Ser Glu Glu Ser Gly Glu
390 395

<210> 311
<211> 466
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -16..-1

<400> 311

Met Gly Leu Tyr Ala Ala Ala Ala Gly Val Leu Ala Gly Val Glu Ser
-15 -10 -5

Arg Gln Gly Ser Ile Lys Gly Leu Val Tyr Ser Ser Asn Phe Gln Asn
1 5 10 15

Val Lys Gln Leu Tyr Ala Leu Val Cys Glu Thr Gln Arg Tyr Ser Ala
20 25 30

Val Leu Asp Ala Val Ile Ala Ser Ala Gly Leu Leu Arg Ala Glu Lys
35 40 45

Lys Leu Arg Pro His Leu Ala Lys Val Leu Val Tyr Glu Leu Leu Leu
50 55 60

Gly Lys Gly Phe Arg Gly Gly Gly Gly Arg Trp Lys Ala Leu Leu Gly
65 70 75 80

Arg His Gln Ala Arg Leu Lys Ala Glu Leu Ala Arg Leu Lys Val His
85 90 95

Arg Gly Val Ser Arg Asn Glu Asp Leu Leu Glu Val Gly Ser Arg Pro
100 105 110

Gly Pro Ala Ser Gln Leu Pro Arg Phe Val Arg Val Asn Thr Leu Lys
115 120 125

Thr Cys Ser Asp Asp Val Val Asp Tyr Phe Lys Arg Gln Gly Phe Ser
130 135 140

Tyr Gln Gly Arg Ala Ser Ser Leu Asp Asp Leu Arg Ala Leu Lys Gly
145 150 155 160

Lys His Phe Leu Leu Asp Pro Leu Met Pro Glu Leu Leu Val Phe Pro
165 170 175

Ala Gln Thr Asp Leu His Glu His Pro Leu Tyr Arg Ala Gly His Leu
180 185 190

Ile Leu Gln Asp Arg Ala Ser Cys Leu Pro Ala Met Leu Leu Asp Pro
195 200 205

Pro Pro Gly Ser His Val Ile Asp Ala Cys Ala Ala Pro Gly Asn Lys
210 215 220

Thr Ser His Leu Ala Ala Leu Leu Lys Asn Gln Gly Lys Ile Phe Ala
225 230 235 240

Phe Asp Leu Asp Ala Lys Arg Leu Ala Ser Met Ala Thr Leu Leu Ala
245 250 255

Arg Ala Gly Val Ser Cys Cys Glu Leu Ala Glu Glu Asp Phe Leu Ala
260 265 270

Val Ser Pro Ser Asp Pro Arg Tyr His Glu Val His Tyr Ile Leu Leu
275 280 285

Asp Pro Ser Cys Ser Gly Ser Gly Met Pro Ser Arg Gln Leu Glu Glu
290 295 300

Pro Gly Ala Gly Thr Pro Ser Pro Val Arg Leu His Ala Leu Ala Gly
305 310 315 320

Phe Gln Gln Arg Ala Leu Cys His Ala Leu Thr Phe Pro Ser Leu Gln
325 330 335

Arg Leu Val Tyr Ser Thr Cys Ser Leu Cys Gln Glu Glu Asn Glu Asp
340 345 350

Val Val Arg Asp Ala Leu Gln Gln Asn Pro Gly Ala Phe Arg Leu Ala
355 360 365

Pro Ala Leu Pro Ala Trp Pro His Arg Gly Leu Ser Thr Phe Pro Gly
370 375 380

Ala Glu His Cys Leu Arg Ala Ser Pro Glu Thr Thr Leu Ser Ser Gly
385 390 395 400

Phe Phe Val Ala Val Ile Glu Arg Val Glu Val Pro Ser Ser Ala Ser
405 410 415

Gln Ala Lys Ala Ser Ala Pro Glu Arg Thr Pro Ser Pro Ala Pro Lys
420 425 430

Arg Lys Lys Arg Gln Gln Arg Ala Ala Ala Gly Ala Cys Thr Pro Pro
435 440 445

Cys Thr
450



<210> 312
<211> 382
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -16..-1

<400> 312

Met Gly Leu Tyr Ala Ala Ala Ala Gly Val Leu Ala Gly Val Glu Ser
-15 -10 -5

Arg Gln Gly Ser Ile Lys Gly Leu Val Tyr Ser Ser Asn Phe Gln Asn
1 5 10 15

Val Lys Gln Leu Tyr Ala Leu Val Cys Glu Thr Gln Arg Tyr Ser Ala
20 25 30

Val Leu Asp Ala Val Ile Ala Ser Ala Gly Leu Leu Arg Ala Glu Lys
35 40 45

Lys Leu Arg Pro His Leu Ala Lys Val Leu Val Tyr Glu Leu Leu Leu
50 55 60

Gly Lys Gly Phe Arg Gly Gly Gly Gly Arg Trp Lys Ala Leu Leu Gly
65 70 75 80

Arg His Gln Ala Arg Leu Lys Ala Glu Leu Ala Arg Leu Lys Val His
85 90 95

Arg Gly Val Ser Arg Asn Glu Asp Leu Leu Glu Val Gly Ser Arg Pro
100 105 110

Gly Pro Ala Ser Gln Leu Pro Arg Phe Val Arg Val Asn Thr Leu Lys
115 120 125

Thr Cys Ser Asp Asp Val Val Asp Tyr Phe Lys Arg Gln Gly Phe Ser
130 135 140

Tyr Gln Gly Arg Ala Ser Ser Leu Asp Asp Leu Arg Ala Leu Lys Gly
145 150 155 160

Lys His Phe Leu Leu Asp Pro Leu Met Pro Glu Leu Leu Val Phe Pro
165 170 175

Ala Gln Thr Asp Leu His Glu His Pro Leu Tyr Arg Ala Gly His Leu
180 185 190

Ile Leu Gln Asp Arg Ala Ser Cys Leu Pro Ala Met Leu Leu Asp Pro
195 200 205

Pro Pro Gly Ser His Val Ile Asp Ala Cys Ala Ala Pro Gly Asn Lys
210 215 220

Thr Ser His Leu Ala Ala Leu Leu Lys Asn Gln Gly Lys Ile Phe Ala
225 230 235 240

Phe Asp Leu Asp Ala Lys Arg Leu Ala Ser Met Ala Thr Leu Leu Ala
245 250 255

Arg Ala Gly Val Ser Cys Cys Glu Leu Ala Glu Glu Asp Phe Leu Ala
260 265 270

Val Ser Pro Ser Asp Pro Arg Tyr His Glu Val His Tyr Ile Leu Leu
275 280 285

Asp Pro Ser Cys Ser Gly Ser Gly Met Pro Ser Arg Gln Leu Glu Glu
290 295 300

Pro Gly Ala Gly Thr Pro Ser Pro Val Arg Leu His Ala Leu Ala Ala
305 310 315 320

Ser Ser Ser Glu Pro Cys Ala Thr Arg Ser Leu Ser Leu Pro Cys Ser
325 330 335

Gly Ser Ser Thr Pro Arg Ala Pro Ser Ala Arg Arg Arg Met Lys Thr
340 345 350

Trp Cys Glu Met Arg Cys Ser Arg Thr Arg Ala Pro Ser Gly
355 360 365







<210> 313
<211> 258
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -36..-1

<400> 313

Met Glu Glu Leu Gln Glu Pro Leu Arg Gly Glu Leu Arg Leu Cys Phe
-35 -30 -25

Thr Gln Ala Ala Arg Thr Ser Leu Leu Leu Leu Arg Leu Asn Asp Ala
-20 -15 -10 -5

Ala Leu Arg Ala Leu Gln Glu Cys Gln Arg Gln Gln Val Arg Pro Val
1 5 10

Ile Ala Phe Gln Gly His Arg Gly Tyr Leu Arg Leu Pro Gly Pro Gly
15 20 25

Trp Ser Cys Leu Phe Ser Phe Ile Val Ser Gln Cys Cys Gln Glu Gly
30 35 40

Ala Gly Gly Ser Leu Asp Leu Val Cys Gln Arg Phe Leu Arg Ser Gly
45 50 55 60

Pro Asn Ser Leu His Cys Leu Gly Ser Leu Arg Glu Arg Leu Ile Ile
65 70 75

Trp Ala Ala Met Asp Ser Ile Pro Ala Pro Ser Ser Val Gln Gly His
80 85 90

Asn Leu Thr Glu Asp Ala Arg His Pro Glu Ser Trp Gln Asn Thr Gly
95 100 105

Gly Tyr Ser Glu Gly Asp Ala Val Ser Gln Pro Gln Met Ala Leu Glu
110 115 120

Glu Val Ser Val Ser Asp Pro Leu Ala Ser Asn Gln Gly Gln Ser Leu
125 130 135 140

Pro Gly Ser Ser Arg Glu His Met Ala Gln Trp Glu Val Arg Ser Gln
145 150 155

Thr His Val Pro Asn Arg Glu Pro Val Gln Ala Leu Pro Ser Ser Ala
160 165 170

Ser Arg Lys Arg Leu Asp Lys Lys Arg Ser Val Pro Val Ala Thr Val
175 180 185

Glu Leu Glu Glu Lys Arg Phe Arg Thr Leu Pro Leu Val Pro Pro Pro
190 195 200

Thr Arg Pro Asp Gln Ser Gly Phe Thr Arg Gly Arg Arg Leu Gly Ala
205 210 215 220

Arg Arg

<210> 314
<211> 280
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -33..-1

<400> 314

Met Lys Ser Cys Gly Ser Met Leu Gly Leu Trp Gly Gln Arg Leu Pro
-30 -25 -20

Ala Ala Trp Val Leu Leu Leu Leu Pro Phe Leu Pro Leu Leu Leu Leu
-15 -10 -5

Ala Ala Pro Ala Pro His Arg Ala Ser Tyr Lys Pro Val Ile Val Val
1 5 10 15

His Gly Leu Phe Asp Ser Ser Tyr Ser Phe Arg His Leu Leu Glu Tyr
20 25 30

Ile Asn Glu Thr His Pro Gly Thr Val Val Thr Val Leu Asp Leu Phe
35 40 45

Asp Gly Arg Glu Ser Leu Arg Pro Leu Trp Glu Gln Val Gln Gly Phe
50 55 60

Arg Glu Ala Val Val Pro Ile Met Ala Lys Ala Pro Gln Gly Val His
65 70 75

Leu Ile Cys Tyr Ser Gln Gly Gly Leu Val Cys Arg Ala Leu Leu Ser
80 85 90 95

Val Met Asp Asp His Asn Val Asp Ser Phe Ile Ser Leu Ser Ser Pro
100 105 110

Gln Met Gly Gln Tyr Gly Asp Thr Asp Tyr Leu Lys Trp Leu Phe Pro
115 120 125

Thr Ser Met Arg Ser Asn Leu Tyr Arg Ile Cys Tyr Ser Pro Leu Ile
130 135 140

Asn Gly Glu Arg Asp His Pro Asn Ala Thr Val Trp Arg Lys Asn Phe
145 150 155

Leu Arg Val Gly His Leu Val Leu Ile Gly Gly Pro Asp Asp Gly Val
160 165 170 175

Ile Thr Pro Trp Gln Ser Ser Phe Phe Gly Phe Tyr Asp Ala Asn Glu
180 185 190

Thr Val Leu Glu Met Glu Glu Gln Leu Val Tyr Leu Arg Asp Ser Phe
195 200 205

Gly Leu Lys Thr Leu Leu Ala Arg Gly Ala Ile Val Arg Cys Pro Met
210 215 220

Ala Gly Ile Ser His Thr Ala Trp His Ser Asn Arg Thr Leu Tyr Glu
225 230 235

Thr Cys Ile Glu Pro Trp Leu Ser
240 245

<210> 315
<211> 174
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -33..-1

<400> 315

Met Lys Ser Cys Gly Ser Met Leu Gly Leu Trp Gly Gln Arg Leu Pro
-30 -25 -20

Ala Ala Trp Val Leu Leu Leu Leu Pro Phe Leu Pro Leu Leu Leu Leu
-15 -10 -5

Ala Ala Pro Ala Pro His Arg Ala Ser Tyr Lys Pro Val Ile Val Val
1 5 10 15

His Gly Leu Phe Asp Ser Ser Tyr Ser Phe Arg His Leu Leu Glu Tyr
20 25 30

Ile Asn Glu Thr His Pro Gly Thr Val Val Thr Val Leu Asp Leu Phe
35 40 45

Asp Gly Arg Glu Ser Leu Arg Pro Leu Trp Glu Gln Val Gln Gly Phe
50 55 60

Arg Glu Ala Val Val Pro Ile Met Ala Lys Ala Pro Gln Gly Val His
65 70 75

Leu Ile Cys Tyr Ser Gln Gly Gly Leu Val Cys Arg Ala Leu Leu Ser
80 85 90 95

Val Met Asp Asp His Asn Val Asp Ser Phe Ile Ser Leu Ser Ser Pro
100 105 110

Gln Met Gly Gln Tyr Gly Asp Thr Asp Tyr Leu Lys Trp Leu Phe Pro
115 120 125

Thr Ser Met Arg Ser Asn Leu Tyr Arg Ile Cys Tyr Ser Pro
130 135 140

<210> 316
<211> 160
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -17..-1

<400> 316

Met Ala Phe Thr Phe Ala Ala Phe Cys Tyr Met Leu Ser Leu Val Leu
-15 -10 -5

Cys Ala Ala Leu Ile Phe Phe Ala Ile Trp His Ile Ile Ala Phe Asp
1 5 10 15

Glu Leu Arg Thr Asp Phe Lys Ser Pro Ile Asp Gln Cys Asn Pro Val
20 25 30

His Ala Arg Glu Arg Leu Arg Asn Ile Glu Arg Ile Cys Phe Leu Leu
35 40 45

Arg Lys Leu Val Leu Pro Glu Tyr Ser Ile His Ser Leu Phe Cys Ile
50 55 60

Met Phe Leu Cys Ala Gln Glu Trp Leu Thr Leu Gly Leu Asn Val Pro
65 70 75

Leu Leu Phe Tyr His Phe Trp Arg Tyr Phe His Cys Pro Ala Asp Ser
80 85 90 95

Ser Glu Leu Ala Tyr Asp Pro Pro Val Val Met Asn Pro Asp Thr Leu
100 105 110
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Asp Tyr Leu Cys Leu Ser He Leu 
-15 

Ala Ile Pro Ala Val Ile Phe Ser

1 5
Ser Ser Asp Tyr Glu Leu Ala Ala

15 20
Trp Ala Ile Ala Ser Ile Thr Val
30 35
Thr Tyr Leu Ile Tyr Leu Leu Arg
50 



<210> 329
<211> 95
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -27..-1

<400> 329

Met Thr Asp Gln Asp Arg Ile Ile Asn Leu Val Val Gly Ser Leu Thr
-25 -20 -15

Ser Leu Leu Ile Leu Val Thr Leu Ile Ser Ala Phe Val Phe Pro Gln
-10 -5 1 5

Leu Pro Pro Lys Pro Leu Asn Ile Phe Phe Ala Val Cys Ile Ser Leu
10 15 20

Ser Ser Ile Thr Ala Cys Ile Ile Tyr Trp Tyr Arg Gln Gly Asp Leu
25 30 35

Glu Pro Lys Phe Arg Lys Leu Ile Tyr Tyr Ile Ile Phe Ser Ile Ile
40 45 50

Met Leu Cys Ile Cys Ala Asn Leu Tyr Phe His Asp Val Gly Arg
55 60 65



<210> 330
<211> 84
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -20..-1

<400> 330

Met Ala Ala Ala Ala Val Pro Ser Leu Leu Leu Ser Leu Pro Pro His
-20 -15 -10 -5

Gln Gly Leu Thr Phe Ser Asn Lys Ile Gln Pro Phe Gly Ala Gln Gly
1 5 10

Val Leu His Pro Glu Pro Gly Leu Arg Asp Trp Leu Leu Pro Thr Cys
15 20 25

Ser Arg Gln Leu Arg Val Ala Leu Pro Glu Lys Gly Ser Glu Gly Ser
30 35 40

Leu Cys Gln Thr Gln Leu Pro Ala Thr Pro Cys Phe Leu Pro Ser Asn
45 50 55 60

Thr Val Arg Thr


























<210> 331
<211> 124
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -32..-1

<400> 331

Met Val Val Val Glu Pro Gly Ala Ser Leu Phe Pro Asn Gly Val Pro
-30 -25 -20

Trp Leu Tyr Ala Val Phe Ala Val Leu Phe Val Phe Phe Leu Phe Ala
-15 -10 -5

Met Leu Ser Pro Phe Leu Leu Glu Ile Asp Gln His Ile Lys Lys Phe
1 5 10 15

Leu Ile Arg Cys Arg Tyr Ser Leu His Asn Thr Val His Lys Asp Lys
20 25 30

Lys Asn Ser Glu Ile Lys Met Asp His Leu Glu Arg Pro Gly Cys Pro
35 40 45

Leu Glu Ser Pro Arg Arg Gly Val Leu Gly Gly Lys Lys Asn Gly Met
50 55 60

Gly Asn Asp Pro Leu Leu Phe Val Lys Val Thr Lys Glu Pro Arg Asp
65 70 75 80

Ser Glu Ala Glu Ile Tyr Thr Pro Gly Pro Ser Val
85 90



<210> 332
<211> 62
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -46..-1

<400> 332

Met Asp Gln Leu Val Phe Lys Glu Thr Ile Trp Asn Asp Ala Phe Trp
-45 -40 -35

Gln Asn Pro Trp Asp Gln Gly Gly Leu Ala Val Ile Ile Leu Phe Ile
-30 -25 -20 -15

Thr Ala Val Leu Leu Leu Ile Leu Phe Ala Ile Val Phe Gly Leu Leu
-10 -5 1

Thr Ser Thr Glu Asn Thr Gln Cys Glu Ala Gly Glu Glu Glu
5 10 15

<210> 333
<211> 150
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -23..-1

<400> 333

Met Ser Asn Gln Arg Leu Pro Leu Ile Phe Ser Leu Leu Phe Ile Cys
-20 -15 -10

Phe Phe Gly Glu Ser Phe Cys Ile Cys Asp Gly Thr Val Trp Thr Lys
-5 1 5

Val Gly Trp Glu Ile Leu Pro Glu Glu Val His Tyr Trp Lys Gly Cys
10 15 20 25

Leu Tyr Leu Ile Tyr Asn Leu Leu Gln Ala Val Phe Phe Val Leu Phe
30 35 40

Val Leu Ser Val His Tyr Leu Trp Lys Lys Trp Lys Lys His Gln Lys
45 50 55

Lys Leu Lys Lys Gln Ala Ser Leu Glu Lys Pro Gly Asn Asp Leu Glu
60 65 70

Ser Pro Leu Ile Asn Asn Ile Asp Gln Thr Leu His Arg Val Ala Thr
75 80 85

Thr Ala Ser Val Ile Tyr Lys Ile Trp Glu His Arg Ser His His Pro
90 95 100 105

Ser Ser Lys Lys Ile Lys His Cys Lys Leu Lys Lys Lys Ser Lys Glu
110 115 120

Glu Gly Ala Arg Arg Tyr
125

<210> 334
<211> 198
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -13..-1

<400> 334

Met Leu Leu Gly Arg Leu Thr Ser Gln Leu Leu Arg Ala Val Pro Trp
-10 -5 1

Ala Gly Gly Arg Pro Pro Trp Pro Val Ser Gly Val Leu Gly Ser Arg
5 10 15

Val Cys Gly Pro Leu Tyr Ser Thr Ser Pro Ala Gly Pro Gly Arg Ala
20 25 30 35

Ala Ser Leu Pro Arg Lys Gly Ala Gln Leu Glu Leu Glu Glu Met Val
40 45 50

Pro Arg Lys Met Ser Val Ser Pro Leu Glu Ser Trp Leu Thr Ala Arg
55 60 65

Cys Phe Leu Pro Arg Leu Asp Thr Gly Thr Ala Gly Thr Val Ala Pro
70 75 80

Pro Gln Ser Tyr Gln Cys Pro Pro Ser Gln Ile Gly Glu Gly Ala Glu
85 90 95

Gln Gly Asp Glu Gly Val Ala Asp Ala Pro Gln Ile Gln Cys Lys Asn
100 105 110 115

Val Leu Lys Ile Arg Arg Arg Lys Met Asn His His Lys Tyr Arg Lys
120 125 130

Leu Val Lys Lys Thr Arg Phe Leu Arg Arg Lys Val Gln Glu Gly Arg
135 140 145

Leu Arg Arg Lys Gln Ile Lys Phe Glu Lys Asp Leu Arg Arg Ile Trp
150 155 160

Leu Lys Ala Gly Leu Lys Glu Ala Pro Glu Gly Trp Gln Thr Pro Lys
165 170 175

Ile Tyr Leu Arg Gly Lys
180 185























<210> 335
<211> 88
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -24..-1

<400> 335

Met Val Pro Leu Pro Lys Gln Ser Leu Lys Phe Phe Cys Ala Leu Glu
-20 -15 -10

Val Val Leu Pro Ser Cys Asp Cys Arg Ser Pro Gly Ile Gly Leu Val
-5 1 5

Glu Glu Pro Met Asp Lys Val Glu Glu Gly Pro Leu Ser Phe Leu Met
10 15 20

Lys Arg Lys Thr Ala Gln Lys Leu Ala Ile Gln Lys Ala Leu Ser Asp
25 30 35 40

Ala Phe Gln Lys Leu Leu Ile Val Val Leu Gly Lys Thr Val Leu Ile
45 50 55

Ile Leu Glu Val Leu Gln Phe Gln
60

<210> 336
<211> 150
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -45..-1

<400> 336

Met Val Leu Met Trp Thr Ser Gly Asp Ala Phe Lys Thr Ala Tyr Phe
-45 -40 -35 -30

Leu Leu Lys Gly Ala Pro Leu Gln Phe Ser Val Cys Gly Leu Leu Gln
-25 -20 -15

Val Leu Val Asp Leu Ala Ile Leu Gly Gln Ala Tyr Ala Phe Ala Pro
-10 -5 1

Pro Pro Glu Ala Gly Ala Pro Arg Arg Ala Pro His Trp His Gln Gly
5 10 15

Pro Leu Thr Val Gly Arg Thr Arg Met Trp Asp Arg Gln Pro Arg Ala
20 25 30 35

Leu Val Gly Pro Asp Leu Pro Ala Gly Arg Val Gly Ala Val Ala Pro
40 45 50

Ala Gly Val Ala Glu Met Gly His Gly His Trp Gly Leu His Gln Pro
55 60 65

Leu Trp Gly Val Ser Gly Trp Ala Val Gly Val Gly Leu Gly Arg Cys
70 75 80

Leu Cys Ser Ala Gly Thr Ala Arg Val Asp Leu Ala Pro Arg Val Leu
85 90 95

Asp Val Phe Arg Met Thr
100 105


<210> 337
<211> 142
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -19..-1

<400> 337

Met Ala Thr Ala Ser Pro Ser Val Phe Leu Leu Met Val Asn Gly Gln
-15 -10 -5

Val Glu Ser Ala Gln Phe Pro Glu Tyr Asp Asp Phe Tyr Cys Lys Tyr
1 5 10

Cys Phe Val Tyr Gly Gln Asp Trp Ala Pro Thr Ala Gly Leu Glu Glu
15 20 25

Gly Ile Ser Gln Ile Thr Ser Lys Ser Gln Asp Val Arg Gln Ala Leu
30 35 40 45

Val Trp Asn Phe Pro Ile Asp Val Thr Phe Lys Ser Thr Asn Pro Tyr
50 55 60

Gly Trp Pro Gln Ile Val Leu Ser Val Tyr Gly Pro Asp Val Phe Gly
65 70 75

Asn Asp Val Val Arg Gly Tyr Gly Ala Val His Val Pro Phe Ser Pro
80 85 90

Gly Arg His Lys Arg Thr Ile Pro Met Phe Val Pro Glu Ser Thr Ser
95 100 105

Lys Leu Gln Lys Phe Thr Arg Ser Ala Ser Cys Ser Thr His
110 115 120

<210> 338
<211> 112
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -27..-1

<220>
<221> UNSURE
<222> 21
<223> Xaa = Ala, Pro

<400> 338

Thr Ser Glu Glu Arg Thr Ala Met Lys Arg Glu Gly Gly Ala Ala His
-25 -20 -15

Leu Cys Ser Asp Ser Leu Pro Glu Ser Gln Gln Gln Asp Gly Asn His
-10 -5 1 5

Ala Pro Asn Phe Ser Ser His Gly Ser Cys Arg Arg Arg Gln Arg Xaa
10 15 20

Asp Met Thr Arg Arg Cys Met Pro Ala Arg Pro Gly Phe Pro Ser Ser
25 30 35

Pro Ala Pro Gly Ser Ser Pro Pro Arg Cys His Leu Arg Pro Gly Ser
40 45 50

Thr Ala His Ala Ala Ala Gly Lys Arg Thr Glu Ser Pro Gly Asp Arg
55 60 65

Tyr Arg Ala Glu Gly Leu Arg Arg Gly Arg Val Ala Gly Ala Arg Val
70 75 80 85

<210> 339
<211> 90
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -32..-1

<400> 339

Met Pro Cys Leu Asp Gln Gln Leu Thr Val His Ala Leu Pro Cys Pro
-30 -25 -20

Ala Gln Pro Ser Ser Leu Ala Phe Cys Gln Val Gly Phe Leu Thr Ala
-15 -10 -5

Gln Pro Ser Pro Pro Arg Arg Arg Asn Gly Lys Asp Arg Tyr Thr Leu
1 5 10 15

Val Leu Gln His Gln Glu Cys Gln Asp Asp Leu Ala Thr Ser Ser Leu
20 25 30

Val Tyr Leu Ser Leu Pro Cys Phe Lys Asp Leu Gly Arg Ser Lys His
35 40 45

Gln Ser Ile Thr Val Ala Asp Thr Asn Lys
50 55

<210> 340
<211> 80
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -35..-1

<400> 340

Met Pro Phe Gln Phe Gly Thr Gln Pro Arg Arg Phe Pro Val Glu Gly
-35 -30 -25 -20

Gly Asp Ser Ser Ile Glu Leu Glu Pro Gly Leu Ser Ser Ser Ala Ala
-15 -10 -5

Cys Asn Gly Lys Glu Met Ser Pro Thr Arg Gln Leu Arg Arg Cys Pro
1 5 10

Gly Ser His Cys Leu Thr Ile Thr Asp Val Pro Val Thr Val Tyr Ala
15 20 25

Thr Thr Arg Lys Pro Pro Ala Gln Ser Ser Lys Glu Met His Pro Lys
30 35 40 45

<210> 341
<211> 131
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -15..-1

<400> 341

Met Ser Leu Leu Met Phe Thr Gln Leu Leu Leu Cys Gly Phe Leu Tyr
-15 -10 -5 1

Val Arg Val Asp Gly Ser Arg Leu Arg Gln Glu Asp Phe Pro Pro Arg
5 10 15

Ile Val Glu His Pro Ser Asp Val Ile Val Ser Lys Gly Glu Pro Thr
20 25 30

Thr Leu Asn Cys Lys Ala Glu Gly Arg Pro Thr Pro Thr Ile Glu Trp
35 40 45

Tyr Lys Asp Gly Glu Arg Val Glu Thr Asp Lys Asp Asp Pro Arg Ser
50 55 60 65

His Arg Met Leu Leu Pro Ser Gly Ser Leu Phe Phe Leu Arg Ile Val
70 75 80

His Gly Arg Arg Ser Lys Pro Asp Glu Gly Ser Tyr Val Cys Val Ala
85 90 95

Arg Asn Tyr Leu Gly Glu Ala Val Ser Arg Asn Ala Ser Leu Glu Val
100 105 110

Ala Cys Lys
115



<210> 342
<211> 99
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -39..-1

<400> 342

Met Asp Leu Ile Gly Phe Gly Tyr Ala Ala Leu Val Thr Phe Gly Ser
-35 -30 -25

Ile Phe Gly Tyr Lys Arg Arg Gly Gly Val Pro Ser Leu Ile Ala Gly
-20 -15 -10

Leu Phe Val Gly Cys Leu Ala Gly Tyr Gly Ala Tyr Arg Val Ser Asn
-5 1 5

Asp Lys Arg Asp Val Lys Val Ser Leu Phe Thr Ala Phe Phe Leu Ala
10 15 20 25

Thr Ile Met Gly Val Arg Phe Lys Arg Ser Lys Lys Ile Met Pro Ala
30 35 40

Gly Leu Val Ala Gly Leu Ser Leu Met Met Ile Leu Arg Leu Val Leu
45 50 55

Leu Leu Leu
60



<210> 343
<211> 98
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -43..-1

<400> 343

Met Cys Glu Thr Leu Leu Thr Ser Lys Trp Ala Ser Val Ser Pro Ile
-40 -35 -30

Pro Ala Leu Leu Gln Glu Gly Glu Asn Arg Asp Ser Arg Arg Leu Gly
-25 -20 -15

Asp Ala Leu Leu Phe Leu Arg Pro Ala Gly Ser Cys Ala Leu Gln Val
-10 -5 1 5

Ser Trp Pro Ala Ala Leu Ala Gly Pro Arg Ser His Thr Gly Gln Leu
10 15 20

Thr Gln His Phe Cys His Leu Lys Asn Asp Thr Cys Ile Pro Pro Ser
25 30 35

Leu Gly Pro Pro Arg Asn Ser Gly Ser Leu Glu Ser Leu Arg Ser Lys
40 45 50

Arg Tyr
55

<210> 344
<211> 217
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -19..-1

<220>
<221> UNSURE
<222> 185
<223> Xaa = Phe,Val

<400> 344

Met Val Gly Ile Leu Pro Leu Cys Cys Ser Gly Cys Val Pro Ser Leu
-15 -10 -5

Cys Cys Ser Ser Tyr Val Pro Ser Val Ala Pro Thr Ala Ala His Ser
1 5 10

Val Arg Val Pro His Ser Ala Gly His Cys Gly Gln Arg Val Leu Ala
15 20 25

Cys Ser Leu Pro Gln Val Phe Leu Lys Pro Trp Ile Phe Val Glu His
30 35 40 45

Phe Ser Ser Trp Leu Ser Leu Glu Leu Phe Ser Phe Leu Arg Tyr Leu
50 55 60

Gly Thr Leu Leu Cys Ala Cys Gly His Arg Leu Arg Glu Gly Arg Leu
65 70 75

Leu Pro Cys Leu Leu Gly Val Gly Ser Trp Leu Leu Phe Asn Asn Trp
80 85 90

Thr Gly Gly Ser Trp Phe Ser Leu His Leu Gln Gln Val Ser Leu Ser
95 100 105

Gln Gly Ser His Val Ala Ala Phe Leu Pro Glu Ala Ile Gly Pro Gly
110 115 120 125

Val Pro Val Pro Val Ser Gly Glu Ser Thr Ser Ala Gln Gln Ser His
130 135 140

Ala Gly Trp Gln Leu Ser Ala Glu Ala Asp Ala Cys Pro Ser Val Leu
145 150 155

Tyr Ser Glu Val Leu Glu Trp Asn Lys Asn Ile Asn Thr Tyr Thr Ser
160 165 170

Phe His Asp Phe Cys Leu Ile Leu Gly Ile Phe Xaa Val Leu Phe Cys
175 180 185

Phe Gly Gly Asp Arg Leu Thr Leu His
190 195

<210> 345
<211> 183
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -20..-1

<400> 345

Met Lys Leu Leu Ser Leu Val Ala Val Val Gly Cys Leu Leu Val Pro
-20 -15 -10 -5

Pro Ala Glu Ala Asn Lys Ser Ser Glu Asp Ile Arg Cys Lys Cys Ile
1 5 10

Cys Pro Pro Tyr Arg Asn Ile Ser Gly His Ile Tyr Asn Gln Asn Val
15 20 25

Ser Gln Lys Asp Cys Asn Cys Leu His Val Val Glu Pro Met Pro Val
30 35 40

Pro Gly His Asp Val Glu Ala Tyr Cys Leu Leu Cys Glu Cys Arg Tyr
45 50 55 60

Glu Glu Arg Ser Thr Thr Thr Ile Lys Val Ile Ile Val Ile Tyr Leu
65 70 75

Ser Val Val Gly Ala Leu Leu Leu Tyr Met Ala Phe Leu Met Leu Val
80 85 90

Asp Pro Leu Ile Arg Lys Pro Asp Ala Tyr Thr Glu Gln Leu His Asn
95 100 105

Glu Glu Glu Asn Glu Asp Ala Arg Ser Met Ala Ala Ala Ala Ala Ser
110 115 120

Leu Gly Gly Pro Arg Ala Asn Thr Val Leu Glu Arg Val Glu Gly Ala
125 130 135 140

Gln Gln Arg Trp Lys Leu Gln Val Gln Glu Gln Arg Lys Thr Val Phe
145 150 155

Asp Arg His Lys Met Leu Ser
160





















<210> 346
<211> 247
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -13..-1

<400> 346





Met 


Leu 


Val 


Leu 


Arg 










-10 




35 


Leu 


Ala 


Pro 


Gin 


Met 




Asp 


5 

Gly 


He 


Phe 


Tyr 




20 












Met 


Asn 


Glu 


Phe 


Leu 


40 










40 




Ala 


His 


Ser 


Glu 


Leu 










55 






Met 


Asn 


Thr 


Val 


Phe 








70 






45 


Thr 


Glu 


Val 


Gin 


Lys 






85 










Phe 


Leu 


He 


Pro 


Asn 




100 












Thr 


Tyr 


Leu 


Val 


Pro 


50 










120 




Val 


Tyr 


Glu 


Leu 


Ala 










135 






Trp 


Gly 


Asp 


Ala 


Phe 








150 






55 


Tyr 


Thr 


Lys 


Leu 


Val 






165 










Arg 


Val 


His 


Val 


Leu 




180 












Gly 


Arg 


His 


Lys 


Ser 


60 










200 




Glu 


Ser 


Val 


Asn 


Tyr 










215 






Thr 


Ser 


Phe 


Ser 


Pro 



Ser Ala Leu Thr Arg 
-5 

Cys Ser Ser Phe Ala 
10 

Glu Phe Arg Ser Tyr 
25 

Glu Asn Phe Glu Lys 
45 

Val Gly Tyr Trp Ser 
60 

His He Trp Lys Tyr 
75 

Ala Leu Ala Lys Asp 
90 

Leu Ala Leu He Asp 
105 

Trp Cys Lys Leu Glu 
125 

Thr Phe Gin Met Lys 
140 

Lys Arg Ala Val His 
155 

Gly Val Phe His Thr 
170 

Trp Trp Asn Glu Ser 
185 

His Glu Asp Pro Arg 
205 

Leu Val Ser Gin Gin 
220 

Leu Lys 



Ala 


Leu 


Ala 


Ser 


Arg 


Thr 


Thr 


Gly 


Pro 


1 

Arg 


Gin 


Tyr 




15 
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Thr 
90 


Thr Gly Thr Arg His His Ala
95

Arg Leu Phe Cys Ile Ile Ser Arg Asp Glu Val Ser Pro Tyr Trp Pro
100 105 110

Gly Trp Ser Arg Thr Pro Asn Leu Val Ile His Leu Pro Gln Pro Pro
115 120 125

Lys Val Leu Gly Leu Pro Ala
130 135

<210> 358
<211> 102
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -14..-1

<400> 358

Met Phe Leu Thr Ala Leu Leu Trp Arg Gly Arg Ile Pro Gly Arg Gln
-10 -5 1

Trp Ile Gly Lys His Arg Arg Pro Arg Phe Val Ser Leu Arg Ala Lys
5 10 15

Gln Asn Met Ile Arg Arg Leu Glu Ile Glu Ala Glu Asn His Tyr Trp
20 25 30

Leu Ser Met Pro Tyr Met Thr Arg Glu Gln Glu Arg Gly His Ala Ala
35 40 45 50

Val Arg Arg Arg Glu Ala Phe Glu Ala Ile Lys Ala Ala Ala Thr Ser
55 60 65

Lys Phe Pro Pro His Arg Phe Ile Ala Asp Gln Leu Asp His Leu Asn
70 75 80

Val Thr Lys Lys Trp Ser
85



<210> 359
<211> 244
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -29..-1

<400> 359

Met Glu Leu Thr Ile Phe Ile Leu Arg Leu Ala Ile Tyr Ile Leu Thr
-25 -20 -15

Phe Pro Leu Tyr Leu Leu Asn Phe Leu Gly Leu Trp Ser Trp Ile Cys
-10 -5 1

Lys Lys Trp Phe Pro Tyr Phe Leu Val Arg Phe Thr Val Ile Tyr Asn
5 10 15

Glu Gln Met Ala Ser Lys Lys Arg Glu Leu Phe Ser Asn Leu Gln Glu
20 25 30 35

Phe Ala Gly Pro Ser Gly Lys Leu Ser Leu Leu Glu Val Gly Cys Gly
40 45 50

Thr Gly Ala Asn Phe Lys Phe Tyr Pro Pro Gly Cys Arg Val Thr Cys
55 60 65

Ile Asp Pro Asn Pro Asn Phe Glu Lys Phe Leu Ile Lys Ser Ile Ala
70 75 80

Glu Asn Arg His Leu Gln Phe Glu Arg Phe Val Val Ala Ala Gly Glu
85 90 95

Asn Met His Gln Val Ala Asp Gly Ser Val Asp Val Val Val Cys Thr
100 105 110 115

Leu Val Leu Cys Ser Val Lys Asn Gln Glu Arg Ile Leu Arg Glu Val
120 125 130

Cys Arg Val Leu Arg Pro Gly Gly Ala Phe Tyr Phe Met Glu His Val
135 140 145

Ala Ala Glu Cys Ser Thr Trp Asn Tyr Phe Trp Gln Gln Val Leu Asp
150 155 160

Pro Ala Trp His Leu Leu Phe Asp Gly Cys Asn Leu Thr Arg Glu Ser
165 170 175

Trp Lys Ala Leu Glu Arg Ala Ser Phe Ser Lys Leu Lys Leu Gln His
180 185 190 195

Ile Gln Ala Pro Leu Ser Trp Glu Leu Val Arg Pro His Ile Tyr Gly
200 205 210

Tyr Ala Val Lys
215

<210> 360
<211> 177
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -23..-1

<400> 360
Met Ser Asn Gln Arg Leu Pro Leu Ile Phe Ser Leu Leu Phe Ile Cys
-20 -15 -10

Phe Phe Gly Glu Ser Phe Cys Ile Cys Asp Gly Thr Val Trp Thr Lys
-5 1 5

Val Gly Trp Glu Ile Leu Pro Glu Glu Val His Tyr Trp Lys Val Lys
10 15 20 25

Gly Ser Pro Ser His Cys Leu Pro Tyr Leu Leu Asp Lys Leu Cys Cys
30 35 40

Asp Phe Ala Asn Met Asp Ile Phe Gln Gly Cys Leu Tyr Leu Ile Tyr
45 50 55

Asn Leu Leu Gln Ala Val Phe Phe Val Leu Phe Val Leu Ser Val His
60 65 70

Tyr Leu Trp Lys Lys Trp Lys Lys His Gln Lys Lys Leu Lys Lys Gln
75 80 85

Ala Ser Leu Glu Lys Pro Gly Asn Asp Leu Glu Ser Pro Leu Ile Asn
90 95 100 105

Asn Ile Asp Gln Thr Leu His Arg Val Ala Thr Thr Ala Ser Val Ile
110 115 120

Tyr Lys Ile Trp Glu His Arg Ser His His Pro Ser Ser Lys Lys Ile
125 130 135

Lys His Cys Lys Leu Lys Lys Lys Ser Lys Glu Glu Gly Ala Arg Arg
140 145 150

Tyr

































<210> 361
<211> 158
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -21..-1

<400> 361
Met Ala Leu Cys Ala Leu Thr Arg Ala Leu Pro Ser Leu Asn Leu Ala
-20 -15 -10

Pro Pro Thr Val Ala Ala Pro Ala Pro Ser Leu Phe Pro Ala Ala Gln
-5 1 5 10

Met Met Asn Asn Gly Leu Leu Gln Gln Pro Ser Ala Leu Met Leu Leu
15 20 25

Pro Cys Arg Pro Val Leu Thr Ser Val Ala Leu Asn Ala Asn Phe Val
30 35 40

Ser Trp Lys Ser Arg Thr Lys Tyr Thr Ile Thr Pro Val Lys Met Arg
45 50 55

Lys Ser Gly Gly Arg Asp His Thr Gly Ala Gly Asn Val Arg Arg Thr
60 65 70 75

Val Gly Arg Val Ser Asn Val Asp His Asn Lys Arg Val Ile Gly Lys
80 85 90

Ala Gly Arg Asn Arg Trp Leu Gly Lys Arg Pro Asn Ser Gly Arg Trp
95 100 105

His Arg Lys Gly Gly Trp Ala Gly Arg Lys Ile Arg Pro Leu Pro Pro
110 115 120

Met Lys Ser Tyr Val Lys Leu Pro Ser Ala Ser Ala Gln Ser
125 130 135







<210> 362
<211> 186
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -19..-1

<400> 362
Met Ala Thr Ala Ser Pro Ser Val Phe Leu Leu Met Val Asn Gly Gln
-15 -10 -5

Val Glu Ser Ala Gln Phe Pro Glu Tyr Asp Asp Leu Tyr Cys Lys Tyr
1 5 10

Cys Phe Val Tyr Gly Gln Asp Trp Ala Pro Thr Ala Gly Leu Glu Glu
15 20 25




Gly Ile Ser Gln Ile Thr Ser Lys Ser Gln Asp Val Arg Gln Ala Leu
30 35 40 45

Val Trp Asn Phe Pro Ile Asp Val Thr Phe Lys Ser Thr Asn Pro Tyr
50 55 60

Gly Trp Pro Gln Ile Val Leu Ser Val Tyr Gly Pro Asp Val Phe Gly
65 70 75

Asn Asp Val Val Arg Gly Tyr Gly Ala Val His Val Pro Phe Ser Pro
80 85 90

Gly Arg His Lys Arg Thr Ile Pro Met Phe Val Pro Glu Ser Thr Ser
95 100 105

Lys Leu Gln Lys Phe Thr Ser Trp Phe Met Gly Arg Arg Pro Glu Tyr
110 115 120 125

Thr Asp Pro Lys Val Val Ala Gln Gly Glu Gly Arg Glu Ala Ile Thr
130 135 140

Ala Pro Arg Lys Ala Val Phe Ser Val His Gly Leu Thr Ser Pro Arg
145 150 155

Ala Leu Ala Leu Val His Ile Lys Gly Thr
160 165

<210> 363
<211> 150
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -47..-1

<400> 363
Met Gly Asp Arg Val Lys Gly Ser Lys Ser Arg Ala Phe Val Ser Pro
-45 -40 -35

Trp Pro His Thr Pro Met Ala Ser Gly Leu Arg Asp Pro Trp Leu Gln
-30 -25 -20

Pro Thr Ala Leu Gly Leu Ala Leu Cys Ser Thr Lys Ala Leu Ser Val
-15 -10 -5 1

Gly Ser Ala Pro Leu Pro Pro Arg Asn Ser Asn Thr Met Ala Ala Ala
5 10 15

Ala Leu Ala Ala Pro Ser Leu Gly Phe Asp Gly Val Ile Gly Val Leu
20 25 30

Val Ala Asp Thr Ser Leu Thr Asp Met His Val Val Asp Val Glu Leu
35 40 45

Ser Gly Pro Arg Gly Pro Thr Gly Arg Ser Phe Ala Val His Thr Arg
50 55 60 65

Arg Glu Asn Pro Ala Glu Pro Gly Ala Val Thr Gly Ser Ala Thr Val
70 75 80

Thr Ala Phe Trp Arg Ser Leu Leu Ala Cys Cys Gln Leu Pro Ser Arg
85 90 95

Pro Gly Ile His Leu Cys
100

<210> 364
<211> 95
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -45..-1

<400> 364
Met Leu His His Val Ile Thr Ala Gly Pro Val Leu Leu Leu His Leu
-45 -40 -35 -30

Pro Arg Pro Asp Thr Ser Thr Arg Leu Leu Leu Thr Ser Val Ser Ala
-25 -20 -15

Phe Ile Leu Leu Leu Leu Leu Ser Gly Pro Ala Glu Met Ser Ala Ser
-10 -5 1

Gln Glu Ser Phe Pro Gly Ser Leu Gln Gln Glu Ile Ala Ser Leu Ile
5 10 15

Thr Val Ala Leu Gly Ser Leu Ile Ser Leu Ser Cys Ser Thr Leu Leu
20 25 30 35

Tyr Phe Ser Cys Glu Leu Lys Ile Pro Cys Glu Asp Val Asn Leu
40 45 50

<210> 365
<211> 94
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -26..-1

<400> 365
Met Ala Ala Ile Glu Ile Glu Val Lys Pro Asn Gln Gly Phe Cys Gly
-25 -20 -15

Ser Ala Cys Leu Leu Ala Val Ile Arg Ala Phe Phe Phe Lys Lys Asn
-10 -5 1 5

Ala Cys Leu Leu Arg Glu Ile Leu Gln Ser Lys Leu Gly Gly Met Gly
10 15 20

Pro Val Val Phe Ser Tyr Arg Gly Leu Pro Leu Trp Leu Phe Ala Trp
25 30 35

Leu Phe Pro Arg Cys Thr Val Pro Leu Thr Phe Gly Phe Glu Asn Met
40 45 50

Arg Gly Leu Gly Val Val Ala Tyr Ala Cys Asn Pro Ser Thr
55 60 65

<210> 366
<211> 140
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -40..-1

<400> 366
Met Thr Ser Met Thr Gln Ser Leu Arg Glu Val Ile Lys Ala Met Thr
-40 -35 -30 -25

Lys Ala Arg Asn Phe Glu Arg Val Leu Gly Lys Ile Thr Leu Val Ser
-20 -15 -10

Ala Ala Pro Gly Lys Val Ile Cys Glu Met Lys Val Glu Glu Glu His
-5 1 5

Thr Asn Ala Ile Gly Thr Leu His Gly Gly Leu Thr Ala Thr Leu Val
10 15 20

Asp Asn Ile Ser Thr Met Ala Leu Leu Cys Thr Glu Arg Gly Ala Pro
25 30 35 40

Gly Val Ser Val Asp Met Asn Ile Thr Tyr Met Ser Pro Ala Lys Leu
45 50 55

Gly Glu Asp Ile Val Ile Thr Ala His Val Leu Lys Gln Gly Lys Thr
60 65 70

Leu Ala Phe Thr Ser Val Asp Leu Thr Asn Lys Ala Thr Gly Lys Leu
75 80 85

Ile Ala Gln Gly Arg His Thr Lys His Leu Gly Asn
90 95 100

<210> 367
<211> 39
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -35..-1

<400> 367
Met Asp Pro Gly Trp Pro His Phe Lys Leu Thr His Ser Arg Cys Met
-35 -30 -25 -20

Ala Val Leu Phe Leu Gly Thr Leu Pro Leu Cys Pro Val Thr Ser Pro
-15 -10 -5

Val Trp Gly Trp Ser Pro Gly
1

<210> 368
<211> 78
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -41..-1

<400> 368
Met Ser Ala Ser Val Val Ser Val Ile Ser Arg Phe Leu Glu Glu Tyr
-40 -35 -30

Leu Ser Ser Thr Pro Gln Arg Leu Lys Leu Leu Asp Ala Tyr Leu Leu
-25 -20 -15 -10

Tyr Ile Leu Leu Thr Gly Ala Leu Gln Phe Gly Tyr Cys Leu Leu Val
-5 1 5

Gly Thr Phe Pro Phe Asn Ser Phe Leu Ser Gly Phe Ile Ser Cys Val
10 15 20

Gly Ser Phe Ile Leu Ala Gly Ser Leu Phe Glu Phe Pro Gly
25 30 35

<210> 369
<211> 83
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -40..-1

<400> 369
Met Gly Leu Thr Ser Thr Trp Arg Tyr Gly Arg Gly Pro Gly Ile Gly
-40 -35 -30 -25

Thr Val Thr Met Val Ser Trp Gly Arg Phe Ile Cys Leu Val Val Val
-20 -15 -10

Thr Met Ala Thr Leu Ser Leu Ala Arg Pro Ser Phe Ser Leu Val Glu
-5 1 5

Asp Thr Thr Leu Glu Pro Glu Asp Ala Ile Ser Ser Gly Asp Asp Glu
10 15 20

Asp Asp Thr Asp Gly Ala Glu Asp Phe Val Ser Glu Asn Ser Asn Asn
25 30 35 40

Lys Ser Lys

<210> 370
<211> 92
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -15..-1

<400> 370
Met Ala Val Leu Ala Gly Ser Leu Leu Gly Pro Thr Ser Arg Ser Ala
-15 -10 -5 1

Ala Leu Leu Gly Gly Arg Trp Leu Gln Pro Arg Ala Trp Leu Gly Phe
5 10 15

Pro Asp Ala Trp Gly Leu Pro Thr Pro Gln Gln Ala Arg Gly Lys Ala
20 25 30

Arg Gly Asn Glu Tyr Gln Pro Ser Asn Ile Lys Arg Lys Asn Lys His
35 40 45

Gly Trp Val Arg Arg Leu Ser Thr Pro Ala Gly Val Gln Val Ile Leu
50 55 60 65

Arg Arg Met Leu Lys Gly Arg Lys Ser Leu Ser His
70 75


<210> 371
<211> 279
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -42..-1

<400> 371
Met Ala Ala Pro Val Arg Arg Thr Leu Leu Gly Val Ala Gly Gly Trp
-40 -35 -30

Arg Arg Phe Glu Arg Leu Trp Ala Gly Ser Leu Ser Ser Arg Ser Leu
-25 -20 -15

Ala Leu Ala Ala Ala Pro Ser Ser Asn Gly Ser Pro Trp Arg Leu Leu
-10 -5 1 5

Gly Ala Leu Cys Leu Gln Arg Pro Pro Val Val Ser Lys Pro Leu Thr
10 15 20

Pro Leu Gln Glu Glu Met Ala Ser Leu Leu Gln Gln Ile Glu Ile Glu
25 30 35

Arg Ser Leu Tyr Ser Asp His Glu Leu Arg Ala Leu Asp Glu Asn Gln
40 45 50

Arg Leu Ala Lys Lys Lys Ala Asp Leu His Asp Glu Glu Asp Glu Gln
55 60 65 70

Asp Ile Leu Leu Ala Gln Asp Leu Glu Asp Met Trp Glu Gln Lys Phe
75 80 85

Leu Gln Phe Lys Leu Gly Ala Arg Ile Thr Glu Ala Asp Glu Lys Asn
90 95 100

Asp Arg Thr Ser Leu Asn Arg Asn Leu Asp Arg Asn Leu Val Leu Leu
105 110 115

Val Arg Glu Lys Phe Gly Asp Gln Asp Val Trp Ile Leu Pro Gln Ala
120 125 130

Glu Trp Gln Pro Gly Glu Thr Leu Arg Gly Thr Ala Glu Arg Thr Leu
135 140 145 150

Ala Thr Leu Ser Glu Asn Asn Met Glu Ala Lys Phe Leu Gly Asn Ala
155 160 165

Pro Cys Gly His Tyr Thr Phe Lys Phe Pro Gln Ala Met Arg Thr Glu
170 175 180

Ser Asn Leu Gly Ala Lys Val Phe Phe Phe Lys Ala Leu Leu Leu Thr
185 190 195

Gly Asp Phe Ser Gln Ala Gly Asn Lys Gly His His Val Trp Val Ile
200 205 210

Lys Asp Glu Leu Gly Asp Tyr Leu Lys Pro Lys Tyr Leu Ala Gln Val
215 220 225 230

Arg Arg Phe Val Ser Asp Leu
235

<210> 372
<211> 184
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -31..-1

<400> 372
Met Ala Cys Thr Thr Thr Ala Pro Ala Gln Glu His Met Leu Leu Thr
-30 -25 -20

Pro Leu Thr Ala Leu Met Val Gly Ala Ala Ser Leu Leu Glu Gly Arg
-15 -10 -5 1

Pro Gln Ile Ser Ala Pro Tyr Ser Arg Ala Ala Cys Cys Ser Pro Gly
5 10 15

Ala Leu Gly Cys Pro Ala Ala Arg Val Gly Ile Leu Asp Leu Met Tyr
20 25 30

Ser Trp Val Ala Arg Lys Val Leu Arg Cys Ser Asn Thr Gly Leu Gln
35 40 45

Gly Leu His Cys Ala Pro Ala Tyr Ala Ala Gln Leu Gly Met Asp Pro
50 55 60 65

Gly Arg Gly Gln Arg Ala Gly Gly Pro Val Glu Gln Thr Tyr Phe Ser
70 75 80

Pro Met Gly Lys Leu Pro Thr Leu Ser Trp Leu Glu Gly Cys Thr Ala
85 90 95

Val Met Thr Leu Ala Ser Ala Trp Leu Leu Gly Ser Pro Arg Glu Thr
100 105 110

Tyr Asn His Glu Lys Val Lys Glu Lys Gln Cys Pro Phe Ser Ser Met
115 120 125

Val Leu Gly Glu Tyr Gly Phe Leu Pro Thr Val Asp His Leu Ser Thr
130 135 140 145

Leu Gly Cys Asn Met Arg Glu Leu
150

<210> 373
<211> 101
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -42..-1

<400> 373
Met Ala His Val Ala Glu Lys Asp Gly Leu Asp Trp Ala Ser Gly Cys
-40 -35 -30

Ile Pro Gly Leu Gln Thr Gly Ile Cys Leu Phe Gly Ser Gln Leu Cys
-25 -20 -15

Phe His Leu Ser Trp Leu Tyr Ser Trp Ala Ser Gln Cys Gly Pro Thr
-10 -5 1 5

Ala Pro Val Ile Asp Lys Lys Ser Ser Pro Leu Leu Thr Glu Leu Leu
10 15 20

Asp Leu Val Leu Ile Gly Pro Asp Glu Glu Gly Ile Gln Pro Gln Val
25 30 35

Ile Ile Val Ala Arg Lys Met Glu Tyr Thr Lys Trp Thr Gly Leu Ala
40 45 50

Cys Thr His Arg Asp
55
<210> 374
<211> 85
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -20..-1

<400> 374
Met Gly Pro Asn Thr Lys Asn Leu Leu Leu Val Thr Leu Val Ala Ser
-20 -15 -10 -5

Thr Val Pro Gly Asn Ser Leu Gly Gln Asp Phe Thr Phe Ala His Leu
1 5 10

Glu Arg Ser Cys Thr Arg Glu Asn Arg Ser Pro Gly Glu Val Phe Gln
15 20 25

Gln Pro Cys Lys Ser Gly Gly Gly Gly Val Gly Glu Pro Asn Ala Gln
30 35 40

Gly Gln Leu Leu Ser Gln His Pro Leu Pro Ala Phe Ile Asn Cys Ser
45 50 55 60

His Gly Gln Ala Phe
65

<210> 375
<211> 90
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -28..-1

<400> 375
Met Ala Phe Pro Gly Gln Ser Asp Thr Lys Met Gln Trp Pro Glu Val
-25 -20 -15

Pro Ala Leu Pro Leu Leu Ser Ser Leu Cys Met Ala Met Val Arg Lys
-10 -5 1

Ser Ser Ala Leu Gly Lys Glu Val Gly Arg Arg Val Lys Glu Met Val
5 10 15 20

Met Leu Val Ala Pro Phe Arg Gln Ser Ser Ser Leu Ser Arg Thr Phe
25 30 35

Ser Ser Arg Lys Val Val Lys Ala His Ala Ser Leu His Gly Ala Arg
40 45 50

Leu Ser Pro Leu Ser Arg Asn Ile Arg Gly
55 60

<210> 376
<211> 89
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -33..-1

<220>
<221> UNSURE
<222> 47
<223> Xaa = Ala, Pro, Ser, Thr

<400> 376
Met Ala Gln Pro Ala Ala Pro Ser Leu Thr Arg Pro Phe Leu Ala Glu
-30 -25 -20

Ala Pro Thr Ala Leu Val Pro His Ser Pro Leu Pro Gly Ala Leu Ser
-15 -10 -5

Ser Ala Pro Gly Pro Lys Gln Pro Pro Thr Ala Ser Thr Gly Pro Glu
1 5 10 15

Leu Leu Leu Leu Pro Leu Ser Ser Phe Met Pro Cys Gly Ala Ala Ala
20 25 30

Pro Ala Arg Val Ser Ser Gln Arg Ala Thr Pro Arg Asp Lys Pro Xaa
35 40 45

Gly Pro Leu Ile Pro Gly Gln Cys Pro
50 55

<210> 377
<211> 132
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -15..-1

<400> 377
Met Asn Arg Val Leu Cys Ala Pro Ala Ala Gly Ala Val Arg Ala Leu
-15 -10 -5 1

Arg Leu Ile Gly Trp Ala Ser Arg Ser Leu His Pro Leu Pro Gly Ser
5 10 15

Arg Asp Arg Ala His Pro Ala Ala Glu Glu Glu Asp Asp Pro Asp Arg
20 25 30

Pro Ile Glu Phe Ser Ser Ser Lys Ala Asn Pro His Arg Trp Ser Val
35 40 45

Gly His Thr Met Gly Lys Gly His Gln Arg Pro Trp Trp Lys Val Leu
50 55 60 65

Pro Leu Ser Cys Phe Leu Val Ala Leu Ile Ile Trp Cys Tyr Leu Arg
70 75 80

Glu Glu Ser Glu Ala Asp Gln Trp Leu Arg Gln Val Trp Gly Glu Val
85 90 95

Pro Glu Pro Ser Asp Arg Ser Glu Glu Pro Glu Thr Pro Ala Ala Tyr
100 105 110

Arg Ala Arg Thr
115

<210> 378
<211> 102
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -14..-1

<220>
<221> UNSURE
<222> 50
<223> Xaa = Ala, Gly

<220>
<221> UNSURE
<222> 51
<223> Xaa = Leu, Met, Val

<400> 378
Met Phe Leu Thr Ala Leu Leu Trp Arg Gly Arg Ile Pro Gly Arg Gln
-10 -5 1

Trp Ile Gly Lys His Arg Arg Pro Arg Phe Val Ser Leu Arg Ala Lys
5 10 15

Gln Asn Met Ile Arg Arg Leu Glu Ile Asp Ala Glu Asn His Tyr Trp
20 25 30

Leu Ser Met Pro Tyr Met Thr Arg Glu Gln Glu Arg Gly His Ala Xaa
35 40 45 50

Xaa Arg Arg Arg Glu Ala Phe Glu Ala Ile Lys Ala Ala Ala Thr Ser
55 60 65

Lys Phe Pro Pro His Arg Phe Ile Ala Asp Gln Leu Asp His Leu Asn
70 75 80

Val Thr Lys Lys Trp Ser
85

<210> 379
<211> 504
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -24..-1

<400> 379
Met Gly Ile Lys Thr Ala Leu Pro Ala Ala Glu Leu Gly Leu Tyr Ser
-20 -15 -10

Leu Val Leu Ser Gly Ala Leu Ala Tyr Ala Gly Arg Gly Leu Leu Glu
-5 1 5

Ala Ser Gln Asp Gly Ala His Arg Lys Ala Phe Arg Glu Ser Val Arg
10 15 20

Pro Gly Trp Glu Tyr Ile Gly Arg Lys Met Asp Val Ala Asp Phe Glu
25 30 35 40

Trp Val Met Trp Phe Thr Ser Phe Arg Asn Val Ile Ile Phe Ala Leu
45 50 55

Ser Gly His Val Leu Phe Ala Lys Leu Cys Thr Met Val Ala Pro Lys
60 65 70

Leu Arg Ser Trp Met Tyr Ala Val Tyr Gly Ala Leu Ala Val Met Gly
75 80 85

Thr Met Gly Pro Trp Tyr Leu Leu Leu Leu Leu Gly His Cys Val Gly
90 95 100

Leu Tyr Val Ala Ser Leu Leu Gly Gln Pro Trp Leu Cys Leu Gly Leu
105 110 115 120

Gly Leu Ala Ser Leu Ala Ser Phe Lys Met Asp Pro Leu Ile Ser Trp
125 130 135

Gln Ser Gly Phe Val Thr Gly Thr Phe Asp Leu Gln Glu Val Leu Phe
140 145 150

His Gly Gly Ser Ser Phe Thr Val Leu Arg Cys Thr Ser Phe Ala Leu
155 160 165

Glu Ser Cys Ala His Pro Asp Arg His Tyr Ser Leu Ala Asp Leu Leu
170 175 180

Lys Tyr Ser Phe Tyr Leu Pro Phe Phe Phe Phe Gly Pro Ile Met Thr
185 190 195 200

Phe Asp Arg Phe His Ala Gln Val Ser Gln Val Glu Pro Val Arg Arg
205 210 215

Glu Gly Glu Leu Trp His Ile Arg Ala Gln Ala Gly Leu Ser Val Val
220 225 230

Ala Ile Met Ala Val Asp Ile Phe Phe His Phe Phe Tyr Ile Leu Thr
235 240 245

Ile Pro Ser Asp Leu Lys Phe Ala Asn Arg Leu Pro Asp Ile Ala Leu
250 255 260

Ala Gly Leu Ala Tyr Ser Asn Leu Val Tyr Asp Trp Val Lys Ala Ala
265 270 275 280

Val Leu Phe Gly Val Val Asn Thr Val Ala Cys Leu Asp His Leu Asp
285 290 295

Pro Pro Gln Pro Pro Lys Cys Ile Thr Ala Leu Tyr Val Phe Ala Glu
300 305 310

Thr His Phe Asp Arg Gly Ile Asn Asp Trp Leu Cys Lys Tyr Val Tyr
315 320 325

Asn His Ile Gly Gly Glu His Ser Ala Val Ile Pro Glu Leu Ala Ala
330 335 340

Thr Val Ala Thr Phe Ala Ile Thr Thr Leu Trp Leu Gly Pro Cys Asp
345 350 355 360

Ile Val Tyr Leu Trp Ser Phe Leu Asn Cys Phe Gly Leu Asn Phe Glu
365 370 375

Leu Trp Met Gln Lys Leu Ala Glu Trp Gly Pro Leu Ala Arg Ile Glu
380 385 390

Ala Ser Leu Ser Val Gln Met Ser Arg Arg Val Arg Ala Leu Phe Gly
395 400 405

Ala Met Asn Phe Trp Ala Ile Ile Met Tyr Asn Leu Val Ser Leu Asn
410 415 420

Ser Leu Lys Phe Thr Glu Leu Val Ala Arg Arg Leu Leu Leu Thr Gly
425 430 435 440

Phe Pro Gln Thr Thr Leu Ser Ile Leu Phe Val Thr Tyr Cys Gly Val
445 450 455

Gln Leu Val Lys Glu Arg Glu Arg Thr Leu Ala Leu Glu Glu Glu Gln
460 465 470

Lys Gln Asp Lys Glu Lys Pro Glu
475 480



<210> 380
<211> 152
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -26..-1

<400> 380
Met Val Thr Phe Pro Asp Val Pro Leu Gly Ile Phe Leu Phe Cys Val
-25 -20 -15

Cys Val Ile Ala Ile Gly Val Val Gln Ala Leu Ile Val Gly Tyr Ala
-10 -5 1 5

Phe His Phe Pro His Leu Leu Ser Pro Gln Ile Gln Arg Ser Ala His
10 15 20

Arg Ala Leu Tyr Arg Arg His Val Leu Gly Ile Val Leu Gln Gly Pro
25 30 35

Ala Leu Cys Phe Ala Ala Ala Ile Phe Ser Leu Phe Phe Val Pro Leu
40 45 50

Ser Tyr Leu Leu Met Val Thr Val Ile Leu Leu Pro Tyr Val Ser Lys
55 60 65 70

Val Thr Gly Trp Cys Arg Asp Arg Leu Leu Gly His Arg Glu Pro Ser
75 80 85

Ala His Pro Val Glu Val Phe Ser Phe Asp Leu His Glu Pro Leu Ser
90 95 100

Lys Glu Arg Val Glu Ala Phe Ser Asp Gly Val Tyr Ala Ile Val Ala
105 110 115

Thr Leu Leu Ile Leu Asp Ile Trp
120 125

<210> 381
<211> 51
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -26..-1

<400> 381
Met Glu Met Leu Phe Asp Glu Arg Ala Pro Leu Leu Phe Ile Leu Phe
-25 -20 -15

Lys Phe Ser Leu Cys Pro Tyr Ala Ala Ala Leu Ser Lys Pro Ile Phe
-10 -5 1 5

Gly Ser Val Ala Cys Met Thr Lys Glu Ile Leu Ala Arg His Gly Gly
10 15 20

Ser Arg Leu
25

<210> 382
<211> 72
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -23..-1

<400> 382
Met Leu Arg Pro Ala Leu Pro Trp Leu Tyr Leu Gly Leu Cys Ser Leu
-20 -15 -10

Leu Val Gly Glu Ala Glu Ala Pro Ser Pro Val Asp Pro Leu Glu Arg
-5 1 5

Ser Arg Pro Tyr Ala Val Leu Arg Gly Gln Asn Leu Val Leu Met Gly
10 15 20 25

Thr Ile Phe Ser Ile Leu Leu Val Thr Val Ile Leu Met Ala Phe Cys
30 35 40

Val Tyr Lys Pro Ile Arg Arg Arg
45

<210> 383
<211> 95
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -48..-1

<400> 383
Met Ala Ser Ser His Trp Asn Glu Thr Thr Thr Ser Val Tyr Gln Tyr
-45 -40 -35

Leu Gly Phe Gln Val Gln Lys Ile Tyr Pro Phe His Asp Asn Trp Asn
-30 -25 -20

Thr Ala Cys Phe Val Ile Leu Leu Leu Phe Ile Phe Thr Val Val Ser
-15 -10 -5

Leu Val Val Leu Ala Phe Leu Tyr Glu Val Leu Asp Cys Cys Cys Cys
1 5 10 15

Val Lys Asn Lys Thr Val Lys Asp Leu Lys Ser Glu Pro Asn Pro Leu
20 25 30

Arg Ser Met Met Asp Asn Ile Arg Lys Arg Glu Thr Glu Val Val
35 40 45

<210> 384
<211> 150
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> -20..-1

<400> 384
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Gly 
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Ser 
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Val 




-20 










-15 










- 10 










-5 
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Gly 


Ala 


Trp 


Leu 


Lys 


Leu 


Gly 


Asn 


Gly 


Gln 


Ala 


Thr 


Ser 


Met 


Val 


Gln 












1 








5 










10 








Leu 
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Gly Gly Arg 


Phe 


Leu 


Met 


Gly 


Thr 


Asn 


Ser 


Pro 


Asp 


Ser 


Arg 








15 










20 










25 










Asp 


Gly 


Glu 


Gly 


Pro 


Val 


Arg 


Glu 


Ala 


Thr 


Val 


Lys 


Pro 


Phe 


Ala 


Ile 


10 




30 










35 










40 












Asp 


Ile 


Phe 


Pro 


Val 


Thr 


Asn 


Lys 


Asp 


Phe 


Arg 


Asp 


Phe 


Val 


Arg 


Glu 




45 










50 










55 










60 




Lys 


Lys 


Tyr 


Arg 


Thr 


Glu 


Ala 


Glu 


Met 


Phe 


Gly 


Trp 


Ser 


Phe 


Val 


Phe 












65 










70 










75 




15 


Glu 


Asp 


Phe 


Val 


Ser 


Asp 


Glu 


Leu 


Arg 


Asn 


Lys 


Ala 


Thr 


Gln 


Pro 


Met 










80 










85 










90 








Lys 


Val 


Lys 


Phe 


Thr 


His 


Gly 


Gly 


Thr 


Gly 


Ser 


Ser 


Gln 


Thr 


Ala 


Pro 








95 










100 










105 










Thr 


Cys 


Gly Arg 


Glu 


Ser 


Ser 


Pro 


Arg 


Glu 


Thr 


Lys 


Leu 


Arg 


Met 


Ala 


20 




110 










115 










120 












Ser 


Met 


Glu 


Ser 


Pro 


Gln 
























125 










130 
























<210> 385
<211> 354
<212> PRT
<213> Homo sapiens

<400> 385


Met 


Ser 


Ala 


Gly 


Gly 


Gly 


Arg 


Ala 


Phe 


Ala 


Trp 


Gln 


Val 


Phe 


Pro 


Pro 




1 
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10 










15 






Met 


Pro 


Thr 
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Arg 


Val 
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Gly 


Thr 


Val 


Ala 


His 


Gln 


Asp 


Gly 


His 










20 










25 










30 








Leu 


Leu 


Val 


Leu 


Gly 


Gly 


Cys 


Gly 


Arg 


Ala 


Gly 


Leu 


Pro 


Leu 


Asp 


Thr 


35 






35 










40 










45 










Ala 


Glu 


Thr 


Leu 


Asp 


Met 


Ala 


Ser 


His 


Thr 


Trp 


Leu 


Ala 


Leu 


Ala 


Pro 






50 










55 










60 












Leu 


Pro 


Thr 


Ala 


Arg 


Ala 


Gly 


Ala 


Ala 


Ala 


Val 


Val 


Leu 


Gly 


Lys 


Gln 




65 










70 










75 










80 


40 
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Leu 


Val 


Val 
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Glu 


Val 
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Ser 


Pro 


Val 
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Ala 












85 
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Glu 
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Phe 


Leu 


Met 


Asp 


Glu 


Gly 


Arg 
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Glu 


Arg 


Arg 


Ala 
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100 










105 
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Leu 
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Ala 


Ala 


Met 


Gly 


Val 


Ala 


Thr 


Val 


Glu 


Arg 


Asp 


Gly 


Met 


45 
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Gly 
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Met 
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Ala 


Pro 
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Ala 


Gln 
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Val 


Arg 
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Asp 


Pro 


Arg 


Arg 


Asp 
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Trp 


Leu 


Ser 
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Pro 


Ser 




145 
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50 


Met 


Pro 


Thr 


Pro 


Cys 
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Gly 
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Ser 


Thr 


Phe 


Leu 


His 


Gly 
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Lys 












165 
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Val 


Leu 


Gly 


Gly 


Arg 
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Gly 


Lys 


Leu 


Pro 


Val 


Thr 


Ala 


Phe 










180 










185 
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Glu 


Ala 


Phe 


Asp 


Leu 


Glu 


Ala 


Arg 


Thr 


Trp 


Thr 


Arg 


His 


Pro 


Ser 


Leu 


55 
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Ser 


Arg 


Arg 


Ala 


rile 


M. J. d 


\j .L y 




Al a 
/A x d 


Met 
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210 










215 
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Phe 


Ser 


Leu 


Gly 


Gly 


Leu 


Gln 
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Pro 
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Phe 
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Ser 




225 










230 
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240 


60 


Arg 


Pro 


His 


Phe 


Val 
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Thr 


Val 


Glu 


Met 
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Asp 


Leu 


Glu 


His 


Gly 












245 
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Ser 


Trp 


Thr 


Lys 


Leu 


Pro 


Arg 


Ser 


Leu 


Arg 


Met 


Arg 


Asp 


Lys 


Arg 


Ala 



260 265 270 
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Asp 


Phe 


Val 


Val 


Gly 


Ser 


Leu Gly Gly His 
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Val 
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He 


Gly 


Gly 








275 
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Gly Asn 
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Pro 


Leu Gly 
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Val 


Glu 


Ser 


Phe 


Ser 
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290 










295 








300 
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Ala 


Arg 


Arg 
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Trp 
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Ala 


Met 


Pro 


Thr 


Ala 


Arg 


Cys 




305 










310 








315 










320 




Ser 


Cys 


Ser 


Ser 


Leu 


Gln 


Ala 


Gly Pro 


Arg 


Leu 


Phe 


Val 


Ile 


Gly 


Gly 










325 








330 
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Val 


Ala 


Gln 


Gly 


Pro 


Ser 


Gln 


Ala Val 


Glu 


Ala 


Leu 


Cys 


Leu 


Arg 


Asp 


10 








340 








345 










350 







Gly Val 



<210> 386 
<211> 207 
<212> PRT 
<213> Homo sapiens 

<400> 386 

Met Ala Leu Leu Phe Ala Arg Ser Leu Arg Leu Cys Arg Trp Gly Ala 
1 5 10 15 

Lys Arg Leu Gly Val Ala Ser Thr Glu Ala Gln Arg Gly Val Ser Phe 
20 25 30 

Lys Leu Glu Glu Lys Thr Ala His Ser Ser Leu Ala Leu Phe Arg Asp 
35 40 45 

Asp Thr Gly Val Lys Tyr Gly Leu Val Gly Leu Glu Pro Thr Lys Val 
50 55 60 

Ala Leu Asn Val Glu Arg Phe Arg Glu Trp Ala Val Val Leu Ala Asp 
65 70 75 80 

Thr Ala Val Thr Ser Gly Arg His Tyr Trp Glu Val Thr Val Lys Arg 
85 90 95 

Ser Gln Gln Phe Arg Ile Gly Val Ala Asp Val Asp Met Ser Arg Asp 
100 105 110 

Ser Cys Ile Gly Val Asp Asp Arg Ser Trp Val Phe Thr Tyr Ala Gln 
115 120 125 

Arg Lys Trp Tyr Thr Met Leu Ala Asn Glu Lys Ala Pro Val Glu Gly 
130 135 140 

Ile Gly Gln Pro Glu Lys Val Gly Leu Leu Leu Glu Tyr Glu Ala Gln 
145 150 155 160 

Lys Leu Ser Leu Val Asp Val Ser Gln Val Ser Val Val His Thr Leu 
165 170 175 

Gln Thr Asp Phe Arg Gly Pro Val Val Pro Ala Phe Ala Leu Trp Asp 
180 185 190 

Gly Glu Leu Leu Thr His Ser Gly Leu Glu Val Pro Glu Gly Leu 
195 200 205 

<210> 387 
<211> 210 
<212> PRT 
<213> Homo sapiens 

<400> 387 

Met Ala Ala Ser Val Glu Gln Arg Glu Gly Thr Ile Gln Val Gln Gly 
1 5 10 15 

Gln Ala Leu Phe Phe Arg Glu Ala Leu Pro Gly Ser Gly Gln Ala Arg 
20 25 30 

Phe Ser Val Leu Leu Leu His Gly Ile Arg Phe Ser Ser Glu Thr Trp 
35 40 45 

Gln Asn Leu Gly Thr Leu His Arg Leu Ala Gln Ala Gly Tyr Arg Ala 
50 55 60 

Val Ala Ile Asp Leu Pro Gly Leu Gly His Ser Lys Glu Ala Ala Ala 
65 70 75 80 

Pro Ala Pro Ile Gly Glu Leu Ala Pro Gly Ser Phe Leu Ala Ala Val 
85 90 95 
<210> 388 
<211> 375 
<212> PRT 
<213> Homo sapiens 

<400> 388 
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Gly 


Thr 


Leu 


Gly 


Trp 


Asp 


Lys 


Leu 


Val 


Tyr 








115 










120 










125 








40 


Ala 


Asp 


Thr 


Cys 


Phe 


Ser 


Thr 


He 


Lys 


Leu 


Lys 


Ala 


Glu 


Asp 


Ala 


Ser 






130 










135 










140 












Gly 


Arg 


Glu 


His 


Leu 


He 


Thr 


Leu 


Lys 


Leu 


Lys 


Ala 


Lys 


Tyr 


Pro 


Ala 




145 










150 










155 










160 




Glu 


Ser 


Pro 


Asp 


Tyr 


Phe 


Val 


Asp 


Phe 


Pro 


Val 


Pro 


Phe 


Cys 


Ala 


Ser 


45 










165 










170 










175 






Trp 


Thr 


Pro 


Gin 


Ser 


Ser 


Leu 


He 


Ser 


He 


Tyr 


Ser 


Gin 


Phe 


Leu 


Ala 










180 










185 










190 








Ala 


He 


Glu 


Ser 


Leu 


Lys 


Ala 


Phe 


Trp 


Asp 


Val 


Met 


Asp 


Glu 


He 


Asp 








195 










200 










205 








50 


Glu 


Lys 


Thr 


Trp 


Val 


Leu 


Glu 


Pro 


Glu 


Lys 


Pro 


Pro 


Arg 


Ser 


Ala 


Thr 






210 










215 










220 












Ala 


Arg 


Arg 


He 


Ala 


Leu 


Gly Asn 


Asn 


Val 


Ser 


He 


Asn 


He 


Glu 


Val 




225 










230 










235 










240 




Asp 


Pro 


Arg 


His 


Pro 


Thr 


Met 


Leu 


Pro 


Glu 


Cys 


Phe 


Phe 


Leu 


Gly 


Ala 


55 










245 










250 










255 






Asp 


His 


Val 


Val 


Lys 


Pro 


Leu 


Gly 


He 


Lys 


Leu 


Ser 


Arg 


Asn 


He 


His 










260 










265 










270 








Leu 


Trp 


Asp 


Pro 


Glu 


Asn 


Ser 


Val 


Leu 


Gin 


Asn 


Leu 


Lys 


Asp 


Val 


Leu 








275 










280 










285 








60 


Glu 


He 


Asp 


Phe 


Pro 


Ala 


Arg 


Ala 


He 


Leu 


Glu 


Lys 


Ser 


Asp 


Phe 


Thr 






290 










295 










300 












Met 


Asp 


Cys 


Gly 


He 


Cys 


Tyr 


Ala 


Tyr 


Gin 


Leu 


Asp Gly 


Thr 


He 


Pro 




305 










310 










315 










320 



321 
Asp Gln Val Cys Asp Asn Ser Gln Cys Gly Gln Pro Phe His Gln Ile 
325 330 335 

Cys Leu Tyr Glu Trp Leu Arg Gly Leu Leu Thr Ser Arg Gln Ser Phe 
340 345 350 

Asn Ile Ile Phe Gly Glu Cys Pro Tyr Cys Ser Lys Pro Ile Thr Leu 
355 360 365 

Lys Met Ser Gly Arg Lys His 
370 375 

<210> 389 
<211> 509 
<212> PRT 
<213> Homo sapiens 

<400> 389 

Met Ala Ala Ile Gly Val His Leu Gly Cys Thr Ser Ala Cys Val Ala 
1 5 10 15 

Val Tyr Lys Asp Gly Arg Ala Gly Val Val Ala Asn Asp Ala Gly Asp 
20 25 30 

Arg Val Thr Pro Ala Val Val Ala Tyr Ser Glu Asn Glu Glu Ile Val 
35 40 45 

Gly Leu Ala Ala Lys Gln Ser Arg Ile Arg Asn Ile Ser Asn Thr Val 
50 55 60 

Met Lys Val Lys Gln Ile Leu Gly Arg Ser Ser Ser Asp Pro Gln Ala 
65 70 75 80 

Gln Lys Tyr Ile Ala Glu Ser Lys Cys Leu Val Ile Glu Lys Asn Gly 
85 90 95 

Lys Leu Arg Tyr Glu Ile Asp Thr Gly Glu Glu Thr Lys Phe Val Asn 
100 105 110 

Pro Glu Asp Val Ala Arg Leu Ile Phe Ser Lys Met Lys Glu Thr Ala 
115 120 125 

His Ser Val Leu Gly Ser Asp Ala Asn Asp Val Val Ile Thr Val Pro 
130 135 140 

Phe Asp Phe Gly Glu Lys Gln Lys Asn Ala Leu Gly Glu Ala Ala Arg 
145 150 155 160 

Ala Ala Gly Phe Asn Val Leu Arg Leu Ile His Glu Pro Ser Ala Ala 
165 170 175 

Leu Leu Ala Tyr Gly Ile Gly Gln Asp Ser Pro Thr Gly Lys Ser Asn 
180 185 190 

Ile Leu Val Phe Lys Leu Gly Gly Thr Ser Leu Ser Leu Ser Val Met 
195 200 205 

Glu Val Asn Ser Gly Ile Tyr Arg Val Leu Ser Thr Asn Thr Asp Asp 
210 215 220 

Asn Ile Gly Gly Ala His Phe Thr Glu Thr Leu Ala Gln Tyr Leu Ala 
225 230 235 240 

Ser Glu Phe Gln Arg Ser Phe Lys His Asp Val Arg Gly Asn Ala Arg 
245 250 255 

Ala Met Met Lys Leu Thr Asn Ser Ala Glu Val Ala Lys His Ser Leu 
260 265 270 

Ser Thr Leu Gly Ser Ala Asn Cys Phe Leu Asp Ser Leu Tyr Glu Gly 
275 280 285 

Gln Asp Phe Asp Cys Asn Val Ser Arg Ala Arg Phe Glu Leu Leu Cys 
290 295 300 

Ser Pro Leu Phe Asn Lys Cys Ile Glu Ala Ile Arg Gly Leu Leu Asp 
305 310 315 320 

Gln Asn Gly Phe Thr Thr Asp Asp Ile Asn Lys Val Val Leu Cys Gly 
325 330 335 

Gly Ser Ser Arg Ile Pro Lys Leu Gln Gln Leu Ile Lys Asp Leu Phe 
340 345 350 

Pro Ala Val Glu Leu Leu Asn Ser Ile Pro Pro Asp Glu Val Ile Pro 
355 360 365 

Ile Gly Ala Ala Ile Glu Ala Gly Ile Leu Ile Gly Lys Glu Asn Leu 
370 375 380 
Leu Val Glu Asp Ser Leu Met Ile Glu Cys Ser Ala Arg Asp Ile Leu 
385 390 395 400 

Val Lys Gly Val Asp Glu Ser Gly Ala Ser Arg Phe Thr Val Leu Phe 
405 410 415 

Pro Ser Gly Thr Pro Leu Pro Ala Arg Arg Gln His Thr Leu Gln Ala 
420 425 430 

Pro Gly Ser Ile Ser Ser Val Cys Leu Glu Leu Tyr Glu Ser Asp Gly 
435 440 445 

Lys Asn Ser Ala Lys Glu Glu Thr Lys Phe Ala Gln Val Val Leu Gln 
450 455 460 

Asp Leu Asp Lys Lys Glu Asn Gly Leu Arg Asp Ile Leu Ala Val Leu 
465 470 475 480 

Thr Met Lys Arg Asp Gly Ser Leu His Val Thr Cys Thr Asp Gln Glu 
485 490 495 

Thr Gly Lys Cys Glu Ala Ile Ser Ile Glu Ile Ala Ser 
500 505 

<211> 78 
<212> PRT 
<213> Homo sapiens 

Met Tyr Asn Thr Gly Arg His Val Ser Leu Arg Leu Asp Lys Glu His 
1 5 10 15 

Leu Val Asn Ile Ser Gly Gly Pro Met Thr Tyr Ser His Arg Leu Glu 
20 25 30 

Glu Ile Arg Leu His Phe Gly Ser Glu Asp Ser Gln Gly Ser Glu His 
35 40 45 

Leu Leu Asn Gly Gln Ala Phe Ser Gly Glu Leu Gln Glu Arg Asp Leu 
50 55 60 

Phe Ile Leu Leu Thr Ser Val Ser Gly His Leu Pro Asp Thr 
65 70 75 



<210> 391 
<211> 162 
<212> PRT 
<213> Homo sapiens 

<400> 391 

Met Ala Thr His Ala Leu Glu Ile Ala Gly Leu Phe Leu Gly Gly Val 
1 5 10 15 

Gly Met Val Gly Thr Val Ala Val Thr Val Met Pro Gln Trp Ile Val 
20 25 30 

Ser Ala Phe Ile Glu Asn Asn Ile Val Val Phe Glu Asn Phe Trp Glu 
35 40 45 

Gly Leu Trp Met Asn Cys Val Arg Gln Ala Asn Ile Arg Met Gln Cys 
50 55 60 

Lys Ile Tyr Asp Ser Leu Leu Ala Leu Ser Pro Asp Leu Gln Ala Ala 
65 70 75 80 

Arg Gly Leu Met Cys Ala Ala Ser Val Met Ser Phe Leu Ala Phe Met 
85 90 95 

Met Ala Ile Leu Gly Met Lys Cys Thr Arg Cys Thr Gly Asp Asn Glu 
100 105 110 

Lys Val Lys Ala His Ile Leu Leu Thr Ala Gly Ile Ile Phe Ile Ile 
115 120 125 

Thr Gly Met Val Val Leu Ile Pro Val Ser Trp Val Ala Asn Ala Ile 
130 135 140 

Ile Arg Asp Phe Tyr Asn Pro Ile Val Asn Val Ala Gln Lys Arg Glu 
145 150 155 160 

Leu Gly 



<210> 392 
<211> 146 
<212> PRT 
<213> Homo sapiens 

<400> 392 

Met Asn Ser Leu Leu His Phe Gly Ile Leu Leu Glu Leu Ser Leu Leu 
1 5 10 15 

Lys Gln Phe Lys Ser Val Tyr Val Pro Gly Asn His Thr His Gln Ala 
20 25 30 

Ser Tyr Lys Pro Leu Leu Lys Gln Val Val Glu Glu Ile Phe His Pro 
35 40 45 

Glu Arg Pro Asp Ser Val Asp Ile Glu His Met Ser Ser Gly Leu Thr 
50 55 60 

Asp Leu Leu Lys Thr Gly Phe Ser Met Phe Met Lys Val Ser Arg Pro 
65 70 75 80 

His Pro Ser Asp Tyr Pro Leu Leu Ile Leu Phe Val Val Gly Gly Val 
85 90 95 

Thr Val Ser Glu Val Lys Met Val Lys Asp Leu Val Ala Ser Leu Lys 
100 105 110 

Pro Gly Thr Gln Val Ile Val Leu Ser Thr Arg Leu Leu Lys Pro Leu 
115 120 125 

Asn Ile Pro Glu Leu Leu Phe Ala Thr Asp Arg Leu His Pro Asp Leu 
130 135 140 

Gly Phe 
145 



<210> 393 
<211> 225 
<212> PRT 
<213> Homo sapiens 

<400> 393 

Met Ala Thr His Ala Leu Glu Ile Ala Gly Leu Phe Leu Gly Gly Val 
1 5 10 15 

Gly Met Val Gly Thr Val Ala Val Thr Val Met Pro Gln Trp Arg Val 
20 25 30 

Ser Ala Phe Ile Glu Asn Asn Ile Val Val Phe Glu Asn Phe Trp Glu 
35 40 45 

Gly Leu Trp Met Asn Cys Val Arg Gln Ala Asn Ile Arg Met Gln Cys 
50 55 60 

Lys Ile Tyr Asp Ser Leu Leu Ala Leu Ser Pro Asp Leu Gln Ala Ala 
65 70 75 80 

Arg Gly Leu Met Cys Ala Ala Ser Val Met Ser Phe Leu Ala Phe Met 
85 90 95 

Met Ala Ile Leu Gly Met Lys Cys Thr Arg Cys Thr Gly Asp Asn Glu 
100 105 110 

Lys Val Lys Ala His Ile Leu Leu Thr Ala Gly Ile Ile Phe Ile Ile 
115 120 125 

Ala Gly Met Val Val Leu Ile Pro Val Ser Trp Val Ala Asn Ala Ile 
130 135 140 

Ile Arg Asp Phe Tyr Asn Pro Ile Val Asn Val Ala Gln Lys Arg Glu 
145 150 155 160 

Leu Gly Glu Ala Leu Tyr Leu Gly Trp Thr Thr Ala Leu Val Leu Ile 
165 170 175 

Val Gly Gly Ala Leu Phe Cys Cys Val Phe Cys Cys Asn Glu Lys Ser 
180 185 190 

Ser Ser Tyr Arg Tyr Ser Ile Pro Ser His Arg Thr Thr Gln Lys Ser 
195 200 205 

Tyr His Thr Gly Lys Lys Ser Pro Ser Val Tyr Ser Arg Ser Gln Tyr 
210 215 220 

Val 
225 

<210> 394 
<211> 114 
<212> PRT 
<213> Homo sapiens 

<400> 394 

Met Arg Leu Gln Asp Arg Ile Ala Thr Phe Phe Phe Pro Lys Gly Met 
1 5 10 15 

Met Leu Thr Thr Ala Ala Leu Met Leu Phe Phe Leu His Leu Gly Ile 
20 25 30 

Phe Ile Arg Asp Val His Asn Phe Cys Ile Thr Tyr His Tyr Asp His 
35 40 45 

Met Ser Phe His Tyr Thr Val Val Leu Met Phe Ser Gln Val Ile Ser 
50 55 60 

Ile Cys Trp Ala Ala Met Gly Ser Leu Tyr Ala Glu Met Thr Glu Asn 
65 70 75 80 

Asn Ala Gln Arg Ser His Val Leu Gln Pro Pro Val Leu Gly Val Ser 
85 90 95 

Gly His Arg Val Pro Gly Gly Ala Pro Leu Arg Pro Gly Glu Ser Glu 
100 105 110 

Gln Gly 

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<400> 395 



30 



40 



50 



60 



Met 


Ala 


Thr 


Pro 


Asn 


Asn 


Leu 


Thr 


Pro 


Thr 


Asn 


Cys 


Ser 


Trp 


Trp 


Pro 


1 








5 










10 










15 




He 


Ser 


Ala 


Leu 


Glu 


Ser 


Asp 


Ala 


Ala 


Lys 


Pro 


Ala 


Glu 


Ala 


Pro 


Asp 








20 










25 










30 






Ala 


Pro 


Glu 


Ala 


Ala 


Ser 


Pro 


Ala 


His 


Trp 


Pro 


Arg 


Glu 


Ser 


Leu 


Val 






35 










40 










45 








Leu 


Tyr 


His 


Trp 


Thr 


Gin 


Ser 


Phe 


Ser 


Ser 


Gin 


Lys 


Val 


Arg 


Leu 


Val 




50 










55 










60 










He 


Ala 


Glu 


Lys 


Gly 


Leu 


Val 


Cys 


Glu 


Glu 


Arg 


Asp 


Val 


Ser 


Leu 


Pro 


65 










70 










75 










80 


Gin 


Ser 


Glu 


His 


Lys 


Glu 


Pro 


Trp 


Phe 


Met 


Arg 


Leu 


Asn 


Leu 


Gly 


Glu 










85 










90 










95 




Glu 


Val 


Pro 


Val 


He 


He 


His 


Arg 


Asp 


Asn 


He 


He 


Ser 


Asp 


Tyr 


Asp 








100 










105 










110 






Gin 


He 


He 


Asp 


Tyr 


Val 


Glu 


Arg 


Thr 


Phe 


Thr 


Gly 


Glu 


His 


Val 


Val 






115 










120 










125 








Ala 


Leu 


Met 


Pro 


Glu 


Val 


Gly 


Ser 


Leu 


Gin 


His 


Ala 


Arg 


Val 


Leu 


Gin 




130 










135 










140 










Tyr 


Arg 


Glu 


Leu 


Leu 


Asp 


Ala 


Leu 


Pro 


Met 


Asp 


Ala 


Tyr 


Thr 


His 


Gly 


145 










150 










155 










160 


Cys 


He 


Leu 


His 


Pro 


Glu 


Leu 


Thr 


Thr 


Asp 


Ser 


Met 


He 


Pro 


Lys 


Tyr 










165 










170 










175 




Ala 


Thr 


Ala 


Glu 


He 


Arg 


Arg 


His 


Leu 


Ala 


Asn 


Ala 


Thr 


Thr 


Asp 


Leu 








180 










185 










190 






Met 


Lys 


Leu 


Asp 


His 


Glu 


Glu 


Glu 


Pro 


Gin 


Leu 


Ser 


Glu 


Pro 


Tyr 


Leu 






195 










200 










205 








Ser 


Lys 


Gin 


Lys 


Lys 


Leu 


Met 


Val 


Lys 


He 


Leu 


Glu 


His 


Asp 


Asp 


Val 




210 










215 










220 










Ser 


Tyr 


Leu 


Lys 


Lys 


He 


Leu Gly Glu 


Leu 


Ala 


Met 


Val 


Leu 


Asp 


Gin 


225 










230 










235 










240 


He 


Glu 


Ala 


Glu 


Leu 


Glu 


Lys 


Arg 


Lys 


Leu 


Glu 


Asn 


Glu 


Gly 


Gin 


Lys 










245 










250 










255 




Cys 


Glu 


Leu 


Trp 


Leu 


Cys 


Gly 


Cys 


Ala 


Phe 


Thr 


Leu 


Ala 


Asp 


Val 


Leu 








260 










265 










270 






Leu 


Gly 
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Thr 


Leu 


His 


Arg 


Leu 


Lys 


Phe 


Leu 


Gly 


Leu 


Ser 


Lys 


Lys 
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275 










280 


Tyr 


Trp 


Glu 


Asp 


Gly 


Ser 


Arg 


Pro 




290 










295 




Val 


Gin 


Arg 


Arg 


Phe 


Ala 


Phe 


Arg 


305 










310 






Thr 


Leu 


Leu 


Ser 


Ala 


Val 


He 


Pro 










325 








Lys 


Pro 


Pro 


Ser 


Phe 


Phe 


Gly Ala 








340 










Gly Met 


Gly 


Tyr 


Phe 


Ala 


Tyr 


Trp 






355 










360 



285 



Asn 


Leu 


Gin 


Ser 


Phe 


Phe 


Glu 


Arg 








300 










Lys 


Val 


Leu 


Gly Asp 


He 


His 


Thr 






315 










320 


Asn 


Ala 


Phe 


Arg 


Leu 


Val 


Lys 


Arg 




330 










335 




Ser 


Phe 


Leu 


Met 


Gly 
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Leu 


Gly 


345 










350 






Tyr 


Leu 
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Tyr 


He 
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Arg 


Ala 


Ara 
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Cys 
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1 
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Met 


Pro 


Val 


Cys 


Ala 


Pro 


Val 


Pro 


Trp 


Arg 


Ala 


Arg 


Arg 


Leu 


Cys 


Thr 








20 










25 
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Ala 


Val 


Val 
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Pro 


Ser 
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Val 


Pro 


Phe 


He 


Ala 


Gly 


Gin 


Gly 






35 










40 










45 








Cys 
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Met 
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Pro 


Ala 


Thr 
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Pro 


Arg 
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Thr 


Arg 


Ser 




50 










55 










60 










Pro 


Leu 


Ala 


Gly 


Gly 


Val 


He 


Leu 


Gly 


Val 


Ala 


Leu 


Trp 


Leu 


Arg 


His 


65 










70 










75 










80 


Asp 


Pro 


Gin 


Thr 


Thr 


Asn 


Leu 


Leu 


Tyr 


Leu 


Glu 


Leu 


Gly 


Asp 


Lys 


Pro 










85 










90 










95 




Ala 


Pro 


Asn 


Thr 


Phe 


Tyr 


Val 


Gly 


He 


Tyr 


He 


Leu 


He 


Ala 


Val 


Gly 








100 










105 










110 






Ala 


Val 


Met 


Met 


Phe 


Val 


Gly Phe 


Leu Gly 


Cys 


Tyr 


Gly 


Ala 


He 


Gin 






115 










120 










125 








Glu 


Ser 


Gin 


Cys 


Leu 


Leu 


Gly 


Thr 


Phe 


Phe 


Thr 


Cys 


Leu 


Val 


He 


Leu 




130 










135 










140 










Phe 


Ala 


Cys 


Glu 


Val 


Ala 


Ala 


Gly 


He 


Trp 


Gly 


Phe 


Val 


Asn 


Lys 


Asp 


145 










150 










155 










160 


Gin 


He 


Ala 


Lys 
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Val 


Lys 


Gin 


Phe 


Tyr 


Asp 


Gin 


Ala 


Leu 


Gin 


Gin 










165 










170 
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Ala 


Val 


Val 


Asp 


Asp 


Asp 


Ala 


Asn 


Asn 


Ala 


Lys 


Ala 


Val 


Val 


Lys 


Thr 








180 










185 










190 






Phe 


His 


Glu 


Thr 


Leu 


Asp 


Cys 


Cys 


Gly 


Ser 


Ser 


Thr 


Leu 


Thr 


Ala 


Leu 






195 










200 










205 








Thr 


Thr 


Ser 


Val 


Leu 


Lys 


Asn 


Asn 


Leu 


Cys 


Pro 


Ser 


Gly 


Ser 


Asn 


He 




210 










215 










220 










He 


Ser 


Asn 


Leu 


Phe 


Lys 


Glu 


Asp 


Cys 


His 


Gin 


Lys 


He 


Asp 


Asp 


Leu 


225 










230 










235 










240 


Phe 


Ser 


Gly 


Lys 


Leu 


Tyr 


Leu 


He 


Gly 


He 


Ala 


Ala 


He 


Val 


Val 


Ala 










245 










250 










255 




Val 


He 


Met 


He 


Phe 


Glu 


Met 


He 


Leu 


Ser 


Met 


Val 


Leu 


Cys 


Cys 


Gly 








260 










265 










270 






He 


Arg 


Asn 


Ser 


Ser 


Val 


Tyr 





















275 



<210> 397 
<211> 173 
<212> PRT 
<213> Homo sapiens 

<400> 397 

Met Cys Leu Leu Leu Gly Ala Thr Gly Val Gly Lys Thr Leu Leu Val 
1 5 10 15 

Lys Arg Leu Gln Glu Val Ser Ser Arg Asp Gly Lys Gly Asp Leu Gly 
20 25 30 

Glu Pro Pro Pro Thr Arg Pro Thr Val Gly Thr Asn Leu Thr Asp Ile 
35 40 45 

Val Ala Gln Arg Lys Ile Thr Ile Arg Glu Leu Gly Gly Cys Met Gly 
50 55 60 

Pro Ile Trp Ser Ser Tyr Tyr Gly Asn Cys Arg Ser Leu Leu Phe Val 
65 70 75 80 

Met Asp Ala Ser Asp Pro Thr Gln Leu Ser Ala Ser Cys Val Gln Leu 
85 90 95 

Leu Gly Leu Leu Ser Ala Glu Gln Leu Ala Glu Ala Ser Val Leu Ile 
100 105 110 

Leu Phe Asn Lys Ile Asp Leu Pro Cys Tyr Met Ser Thr Glu Glu Met 
115 120 125 

Lys Ser Leu Ile Arg Leu Pro Asp Ile Ile Ala Cys Ala Lys Gln Asn 
130 135 140 

Ile Thr Thr Ala Glu Ile Ser Ala Arg Glu Gly Thr Gly Leu Ala Gly 
145 150 155 160 

Val Leu Ala Trp Leu Gln Ala Thr His Arg Ala Asn Asp 
165 170 















<210> 398 
<211> 205 
<212> PRT 
<213> Homo sapiens 

<400> 398 

Met Ala Ala Ala Arg Pro Ser Leu Gly Arg Val Leu Pro Gly Ser Ser 
1 5 10 15 

Val Leu Phe Leu Cys Asp Met Gln Glu Lys Phe Arg His Asn Ile Ala 
20 25 30 

Tyr Phe Pro Gln Ile Val Ser Val Ala Ala Arg Met Leu Lys Val Ala 
35 40 45 

Arg Leu Leu Glu Val Pro Val Met Leu Thr Glu Gln Tyr Pro Gln Gly 
50 55 60 

Leu Gly Pro Thr Val Pro Glu Leu Gly Thr Glu Gly Leu Arg Pro Leu 
65 70 75 80 

Ala Lys Thr Cys Phe Ser Met Val Pro Ala Leu Gln Gln Glu Leu Asp 
85 90 95 

Ser Arg Pro Gln Leu Arg Ser Val Leu Leu Cys Gly Ile Glu Ala Gln 
100 105 110 

Ala Cys Ile Leu Asn Thr Thr Leu Asp Leu Leu Asp Arg Gly Leu Gln 
115 120 125 

Val His Val Val Val Asp Ala Cys Ser Ser Arg Ser Gln Val Asp Arg 
130 135 140 

Leu Val Ala Leu Ala Arg Met Arg Gln Ser Gly Ala Phe Leu Ser Thr 
145 150 155 160 

Ser Glu Gly Leu Ile Leu Gln Leu Val Gly Asp Ala Val His Pro Gln 
165 170 175 

Phe Lys Glu Ile Gln Lys Leu Ile Lys Glu Pro Ala Pro Asp Ser Gly 
180 185 190 

Leu Leu Gly Leu Phe Gln Gly Gln Asn Ser Leu Leu His 
195 200 205 









<210> 399 
<211> 180 
<212> PRT 
<213> Homo sapiens 

<400> 399 

Met Trp Leu Tyr Arg Asn Pro Tyr Val Glu Ala Glu Tyr Phe Pro Thr 
1 5 10 15 

Lys Pro Met Phe Val Ile Ala Phe Leu Ser Pro Leu Ser Leu Ile Phe 
20 25 30 

Leu Ala Lys Phe Leu Lys Lys Ala Asp Thr Arg Asp Ser Arg Gln Ala 
35 40 45 

Cys Leu Ala Ala Ser Leu Ala Leu Ala Leu Asn Gly Val Phe Thr Asn 
50 55 60 

Thr Ile Lys Leu Ile Val Gly Arg Pro Arg Pro Asp Phe Phe Tyr Arg 
65 70 75 80 

Cys Phe Pro Asp Gly Leu Ala His Ser Asp Leu Met Cys Thr Gly Asp 
85 90 95 

Lys Asp Val Val Asn Glu Gly Arg Lys Ser Phe Pro Ser Gly His Ser 
100 105 110 

Ser Phe Ala Phe Ala Gly Leu Ala Phe Ala Ser Phe Tyr Leu Ala Gly 
115 120 125 

Lys Leu His Cys Phe Thr Pro Gln Gly Arg Gly Lys Ser Trp Arg Phe 
130 135 140 

Cys Ala Phe Leu Ser Pro Leu Leu Phe Ala Ala Val Ile Ala Leu Ser 
145 150 155 160 

Arg Thr Cys Asp Tyr Lys His His Trp Gln Asp Leu Leu Lys Cys Thr 
165 170 175 

Asn Thr Ala Lys 
180 



<210> 400 
<211> 150 
<212> PRT 
<213> Homo sapiens 

<400> 400 

Met Cys Thr Ala Leu Leu Leu Leu Tyr Leu Arg Trp Cys Phe Asn Leu 
1 5 10 15 

Lys Leu Val Asn Val Lys Tyr Glu Pro Lys Asp Ser Leu Gly Pro Glu 
20 25 30 

Met Thr Phe Val Ala Asp Ala Ala Arg Gly Pro Leu Leu Ser Ser Leu 
35 40 45 

Asp Ser Pro Ala Asn Leu Met Ser Thr Ala Ser Val Cys Ile Ser Leu 
50 55 60 

Pro Glu Gly Cys Ser Gly Gly Arg Ser Pro Cys Tyr Ser Gln Lys Trp 
65 70 75 80 

Pro Pro Glu Val Pro Glu Lys Leu Thr Ser Leu Gly Gln Gln Ser Ser 
85 90 95 

Thr Ser Ser Leu Thr Asp Thr Asp Val Gln Val Ser Pro Met Leu Val 
100 105 110 

Ala Gly Val Asn His Ser Ser Ser Leu Leu Asp Asn Ile Pro Phe Thr 
115 120 125 

Gly Cys Leu Pro Phe His Leu Ser Ser Ser Leu Pro Tyr Leu Cys Leu 
130 135 140 

Leu Gly Ser Pro Phe Lys 
145 150 



<210> 401 
<211> 170 
<212> PRT 
<213> Homo sapiens 

<400> 401 

Met Glu Asp Pro Asn Pro Glu Glu Asn Met Lys Gln Gln Asp Ser Pro 
1 5 10 15 

Lys Glu Arg Ser Pro Gln Ser Pro Gly Gly Asn Ile Cys His Leu Gly 
20 25 30 

Ala Pro Lys Cys Thr Arg Cys Leu Ile Thr Phe Ala Asp Ser Lys Phe 
35 40 45 

Gln Glu Arg His Met Lys Arg Glu His Pro Ala Asp Phe Val Ala Gln 
50 55 60 

Lys Leu Gln Gly Val Leu Phe Ile Cys Phe Thr Cys Ala Arg Ser Phe 
65 70 75 80 

Pro Ser Ser Lys Ala Leu Ile Thr His Gln Arg Ser His Gly Pro Ala 
85 90 95 

Ala Lys Pro Thr Leu Pro Val Ala Thr Thr Thr Ala Gln Pro Thr Phe 
100 105 110 

Pro Cys Pro Asp Cys Gly Lys Thr Phe Gly Gln Ala Val Ser Leu Arg 
115 120 125 

Arg His Arg Gln Met His Glu Val Arg Ala Pro Pro Gly Thr Phe Ala 
130 135 140 

Cys Thr Glu Cys Gly Gln Asp Phe Ala Gln Glu Ala Gly Leu His Gln 
145 150 155 160 

His Tyr Ile Arg His Ala Arg Gly Glu Leu 
165 170 



<210> 402 
<211> 169 
<212> PRT 
<213> Homo sapiens 

<400> 402 

Met Glu Asp Pro Asn Pro Glu Glu Asn Met Lys Gln Gln Asp Ser Pro 
1 5 10 15 

Lys Glu Arg Ser Pro Gln Pro Arg Arg Gln His Leu Pro Pro Gly Gly 
20 25 30 

Pro Glu Val His Pro Leu Pro His His Leu Arg Arg Phe Gln Val Pro 
35 40 45 

Gly Ala Ser His Glu Ala Gly Ala Pro Ser Gly Leu Arg Gly Pro Glu 
50 55 60 

Ala Ala Gly Gly Pro Leu His Leu Leu His Leu Arg Pro Leu Leu Pro 
65 70 75 80 

Leu Leu Gln Ser Pro Asn His Pro Pro Ala Gln His Gly Pro Ala Ala 
85 90 95 

Lys Pro Thr Leu Pro Val Ala Thr Thr Thr Ala Gln Pro Thr Phe Pro 
100 105 110 

Cys Pro Asp Cys Gly Lys Thr Phe Gly Gln Ala Val Ser Leu Arg Arg 
115 120 125 

His Arg Gln Met His Glu Val Arg Ala Pro Pro Gly Thr Phe Ala Cys 
130 135 140 

Thr Glu Cys Gly Gln Asp Phe Ala Gln Glu Ala Gly Leu His Gln His 
145 150 155 160 

Tyr Ile Arg His Ala Arg Gly Glu Leu 
165 



<210> 403 
<211> 367 
<212> PRT 
<213> Homo sapiens 

<400> 403 

Met Ala Thr Pro Asn Asn Leu Thr Pro Thr Asn Cys Ser Trp Trp Pro 
1 5 10 15 

Ile Ser Ala Leu Glu Ser Asp Ala Ala Lys Pro Ala Glu Ala Pro Asp 
20 25 30 

Ala Pro Glu Ala Ala Ser Pro Ala His Trp Pro Arg Glu Ser Leu Val 
35 40 45 

Leu Tyr His Trp Thr Gln Ser Phe Ser Ser Gln Lys Val Arg Leu Val 
50 55 60 

Ile Ala Glu Lys Gly Leu Val Cys Glu Glu Arg Asp Val Ser Leu Pro 
65 70 75 80 

Gln Ser Glu His Lys Glu Pro Trp Phe Met Arg Leu Asn Leu Gly Glu 
85 90 95 

Glu Val Pro Val Ile Ile His Arg Asp Asn Ile Ile Ser Asp Tyr Asp 
100 105 110 

Gln Ile Ile Asp Tyr Val Glu Arg Thr Phe Thr Gly Glu His Val Val 
115 120 125 

Ala Leu Met Pro Glu Val Gly Ser Leu Gln His Ala Arg Val Leu Gln 
130 135 140 

Tyr Arg Glu Leu Leu Asp Ala Leu Pro Met Asp Ala Tyr Thr His Gly 
145 150 155 160 

Cys Ile Leu His Leu Glu Leu Thr Thr Asp Ser Met Ile Pro Lys Tyr 
165 170 175 

Ala Thr Ala Glu Ile Arg Arg His Leu Ala Asn Ala Thr Thr Asp Leu 
180 185 190 

Met Lys Leu Asp His Glu Glu Glu Pro Gln Leu Ser Glu Pro Tyr Leu 
195 200 205 

Ser Lys Gln Lys Lys Leu Met Ala Lys Ile Leu Glu His Asp Asp Val 
210 215 220 

Ser Tyr Leu Lys Lys Ile Leu Gly Glu Leu Ala Met Val Leu Asp Gln 
225 230 235 240 

Ile Glu Ala Glu Leu Glu Lys Arg Lys Leu Glu Asn Glu Gly Gln Lys 
245 250 255 

Cys Glu Leu Trp Leu Cys Gly Cys Ala Phe Thr Leu Ala Asp Val Leu 
260 265 270 

Leu Gly Ala Thr Leu His Arg Leu Lys Phe Leu Gly Leu Ser Lys Lys 
275 280 285 

Tyr Trp Glu Asp Gly Ser Arg Pro Asn Leu Gln Ser Phe Phe Glu Arg 
290 295 300 

Val Gln Arg Arg Phe Ala Phe Arg Lys Val Leu Gly Asp Ile His Thr 
305 310 315 320 

Thr Leu Leu Ser Ala Val Ile Pro Asn Ala Phe Arg Leu Val Lys Arg 
325 330 335 

Lys Pro Pro Ser Phe Phe Gly Ala Ser Phe Leu Met Gly Ser Leu Gly 
340 345 350 

Gly Met Gly Tyr Phe Ala Tyr Trp Tyr Leu Lys Lys Lys Tyr Ile 
355 360 365 



<210> 404
<211> 20
<212> PRT
<213> Homo sapiens

<400> 404

Met Ala Ala Ala Arg Pro Ser Leu Gly Arg Val Leu Pro Gly Ser Ser
1 5 10 15

Pro Val Pro Val
20



<210> 405 

<211> 225 

<212> PRT 

<213> Homo sapiens 



<400> 405 

Met Ala Thr His Ala Leu Glu Ile Ala Gly Leu Phe Leu Gly Gly Val
1 5 10 15

Gly Met Val Gly Thr Val Ala Val Thr Val Met Pro Gln Trp Arg Val
20 25 30

Ser Ala Phe Ile Glu Asn Asn Ile Val Val Phe Glu Asn Phe Trp Glu
35 40 45

Gly Leu Trp Met Asn Cys Val Arg Gln Ala Asn Ile Arg Met Gln Cys
50 55 60

Lys Ile Tyr Asp Ser Leu Leu Ala Leu Ser Pro Asp Leu Gln Ala Ala
65 70 75 80

Arg Gly Leu Met Cys Ala Ala Ser Val Met Ser Phe Leu Ala Phe Met
85 90 95
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Met 


Ala 


He 


Leu 


Gly Met 


Lys 










100 










Lys 


Val 


Lys 
115 


Ala 


His 


He 


Leu 


5 


Thr 


Gly 
130 


Met 


Val 


Val 


Leu 


He 
135 




He 


Arg 


Asp 


Phe 


Tyr 


Asn 


Ser 




145 










150 






Leu 


Gly 


Glu 


Ala 


Leu 


Tyr 


Leu 


10 










165 








Val 


Gly 


Gly 


Ala 
180 


Leu 


Phe 


Cys 




Ser 


Ser 


Tyr 
195 


Arg 


Tyr 


Ser 


He 


15 


Tyr 


His 
210 


Thr 


Gly 


Lys 


Lys 


Ser 
215 



val 



225 

<210> 406

<211> 378 

<212> PRT 

<213> Homo sapiens 



<400> 406





Met 


Asp 


Pro 


Gly 


Asp 


Asp 


Trp 




1 








5 








Asp 


Phe 


Tyr 


Ala 


Phe 


Asp 


Leu 










20 








30 


He 


Asp 


Asp 


Lys 


Gly Val 


Phe 








35 












Asn 


Glu 


He 


Leu 


His 


Leu 


Lys 






50 










55 




Asn 


Lys 


Gly 


Leu 


Phe 


Pro 


Glu 


35 


65 










70 






Phe 


Ser 


Asp 


Arg 


Ser 


He 


Phe 












85 








Leu 


Leu 


Val 


Thr 


Ser 


Gly 


Leu 










100 








40 


Val 


Ala 


Glu 


Asp 


Ser 


Asp 


Val 








115 












His 


Glu 


Lys 


Glu 


Glu 


Ser 


Leu 






130 










135 




Leu 


Ala 


Pro 


Gly 


Val 


Leu 


His 


45 


145 










150 






Val 


Asp 


Leu 


Glu 


Ser 


Arg 


Lys 












165 








Ser 


Glu 


Glu 


Leu 


Ser 


Ser 


Leu 










180 








50 


Phe 


Cys 


Cys 


Ala 


Ser 


Gly 


Arg 








195 












Trp 


Ala 


Pro 


Leu 


Glu 


Asn 


Arg 






210 










215 




Arg 


Trp 


Cys 


Ala 


Glu 


Val 


Gly 


55 


225 










230 






He 


Ala 


Ser 


Leu 


Ser 


Ser 


Asp 












245 








Asp 


Leu 


Cys 


His 


Pro 


Val 


Ser 










260 








60 


Ser 


Pro 


Asp 


Pro 


Glu 


Leu 


Leu 








275 












Asn 


Cys 


Leu 


Ala 


He 


Ser 


Gly 






290 










295 



Cys 


Thr 


Arg 


Cys 


Thr 


Gly Asp 


Asn 


Glu 




105 










110 






Leu 


Thr 


Ala 


Gly 


He 


He 


Phe 


He 


He 


120 










125 








Pro 


Val 


Ser 


Trp 


Val 


Ala 


Asn 


Ala 


He 










140 










He 


Val 


Asn 


Val 


Ala 


Gin 


Lys 


Arg 


Glu 








155 










160 


Gly 


Trp 


Thr 


Thr 


Ala 


Leu 


Val 


Leu 


He 






170 










175 




Cys 


Val 


Phe 


Cys 


Cys 


Asn 


Glu 


Lys 


Ser 




185 










190 






Pro 


Ser 


His 


Arg 


Thr 


Thr 


Gin 


Lys 


Ser 


200 










205 








Pro 


Ser 


Val 


Tyr 


Ser 


Arg 


Ser 


Gin 


Tyr 



220 



Leu 


Val 


Glu 


Ser 


Leu 


Arg 


Leu 


Tyr 


Gin 






10 










15 




Ser 


Gly 


Ala 


Thr 


Arg 


Val 


Leu 


Glu 


Trp 




25 










30 






Val 


Ala 


Gly 


Tyr 


Glu 


Ser 


Leu 


Lys 


Lys 


40 










45 








Leu 


Pro 


Leu 


Arg 


Leu 


Ser 


Val 


Lys 


Glu 










60 










Arg 


Asp 


Phe 


Lys 


Val 


Arg 


His 


Gly 


Gly 








75 










80 


Asp 


Leu 


Lys 


His 


Val 


Pro 


His 


Thr 


Arg 






90 










95 




Pro 


Gly 


Cys 


Tyr 


Leu 


Gin 


Val 


Trp 


Gin 




105 










110 






He 


Lys 


Ala 


Val 


Ser 


Thr 


He 


Ala 


Val 


120 










125 








Trp 


Pro 


Arg 


Val 


Ala 


Val 


Phe 


Ser 


Thr 










140 










Gly 


Ala 


Arg 


Leu 


Arg 


Ser 


Leu 


Gin 


Val 








155 










160 


Thr 


Thr 


Tyr 


Thr 


Ser 


Asp 


Val 


Ser 


Asp 






170 










175 




Gin 


Val 


Leu 


Asp 


Ala 


Asp 


Thr 


Phe 


Ala 




185 










190 






Leu 


Gly 


Leu 


Val 


Asp 


Thr 


Arg 


Gin 


Lys 


200 










205 








Ser 


Pro 


Gly 


Pro 


Gly 


Ser Gly 


Gly 


Glu 










220 










Ser 


Trp 


Gly 


Gin 


Gly 


Pro Gly 


Pro 


Ser 








235 










240 


Gly 


Arg 


Leu 


Cys 


Leu 


Leu 


Asp 


Pro 


Arg 






250 










255 




Ser 


Val 


Gin 


Cys 


Pro 


Val 


Ser 


Val 


Pro 




265 










270 






Arg 


Val 


Thr 


Trp 


Ala 


Pro Gly 


Leu 


Lys 


280 










285 








Phe 


Asp 


Gly 


Thr 


Val 


Gin 


Val 


Tyr 


Asp 



300 
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Ala Thr Ser Trp Asp Gly Thr Arg 
305 310 
Val Glu Pro Leu Phe Thr His Arg 
325 

5 Gly Met Asp Pro Ala Pro Leu Val 
340 

Arg Pro Arg Thr Leu Leu Ser Ala 
355 360 
Trp Asp Trp Val Asp Leu Cys Ala 
10 370 375 



Ser 


Gin 


Asp 


Gly 


Thr 


Arg 


Ser 


Gin 






315 










320 


Gly 


His 


lie 


Phe 


Leu 


Asp 


Gly 


Asn 




330 










335 




Thr 


Thr 


His 


Thr 


Trp 


His 


Pro 


Cys 


345 










350 






Thr 


Asn 


Asp 


Ala 


Ser 


Leu 


His 


Val 



365 



Pro Arg 



<210> 407
<211> 43
<212> PRT
<213> Homo sapiens

<400> 407

Met Ala Thr His Ala Leu Glu Ile Ala Gly Leu Phe Leu Gly Gly Val
1 5 10 15

Gly Met Val Gly Thr Val Ala Val Thr Val Met Pro Gln Trp Arg Val
20 25 30

Ser Ala Phe Ile Glu Asn Asn Ile Val Val Phe
35 40

<210> 408
<211> 345
<212> PRT
<213> Homo sapiens

<400> 408

Met Ala Trp Arg Gly Trp Ala Gln Arg Gly Trp Gly Cys Gly Gln Ala
1 5 10 15

Trp Gly Ala Ser Val Gly Gly Arg Ser Cys Glu Glu Leu Thr Ala Val
20 25 30

Leu Thr Pro Pro Gln Leu Leu Gly Arg Arg Phe Asn Phe Phe Ile Gln
35 40 45

Gln Lys Cys Gly Phe Arg Lys Ala Pro Arg Lys Val Glu Pro Arg Arg
50 55 60

Ser Asp Pro Gly Thr Ser Gly Glu Ala Tyr Lys Arg Ser Ala Leu Ile
65 70 75 80

Pro Pro Val Glu Glu Thr Val Phe Tyr Pro Ser Pro Tyr Pro Ile Arg
85 90 95

Ser Leu Ile Lys Pro Leu Phe Phe Thr Val Gly Phe Thr Gly Cys Ala
100 105 110

Phe Gly Ser Ala Ala Ile Trp Gln Tyr Glu Ser Leu Lys Ser Arg Val
115 120 125

Gln Ser Tyr Phe Asp Gly Ile Lys Ala Asp Trp Leu Asp Ser Ile Arg
130 135 140

Pro Gln Lys Glu Gly Asp Phe Arg Lys Glu Ile Asn Lys Trp Trp Asn
145 150 155 160

Asn Leu Ser Asp Gly Gln Arg Thr Val Thr Gly Ile Ile Ala Ala Asn
165 170 175

Val Leu Val Phe Cys Leu Trp Arg Val Pro Ser Leu Gln Arg Thr Met
180 185 190

Ile Arg Tyr Phe Thr Ser Asn Pro Ala Ser Lys Val Leu Cys Ser Pro
195 200 205

Met Leu Leu Ser Thr Phe Ser His Phe Ser Leu Phe His Met Ala Ala
210 215 220

Asn Met Tyr Val Leu Trp Ser Phe Ser Ser Ser Ile Val Asn Ile Leu
225 230 235 240

Gly Gln Glu Gln Phe Met Ala Val Tyr Leu Ser Ala Gly Val Ile Ser
245 250 255

Asn Phe Val Ser Tyr Val Gly Lys Val Ala Thr Gly Arg Tyr Gly Pro
260 265 270

Ser Leu Gly Ala Ala Leu Lys Ala Ile Ile Ala Met Asp Thr Ala Gly
275 280 285

Met Ile Leu Gly Trp Lys Phe Phe Asp His Ala Ala His Leu Gly Gly
290 295 300

Ala Leu Phe Gly Ile Trp Tyr Val Thr Tyr Gly His Glu Leu Ile Trp
305 310 315 320

Lys Asn Arg Glu Pro Leu Val Lys Ile Trp His Glu Ile Arg Thr Asn
325 330 335

Gly Pro Lys Lys Gly Gly Gly Ser Lys
340 345


















<210> 409
<211> 236
<212> PRT
<213> Homo sapiens

<400> 409

Met Lys Arg Ser Gly Asn Pro Gly Ala Glu Val Thr Asn Ser Ser Val
1 5 10 15

Ala Gly Pro Asp Cys Cys Gly Gly Leu Gly Asn Ile Asp Phe Arg Gln
20 25 30

Ala Asp Phe Cys Val Met Thr Arg Leu Leu Gly Tyr Val Asp Pro Leu
35 40 45

Asp Pro Ser Phe Val Ala Ala Val Ile Thr Ile Thr Phe Asn Pro Leu
50 55 60

Tyr Trp Asn Val Val Ala Arg Trp Glu His Lys Thr Arg Lys Leu Ser
65 70 75 80

Arg Ala Phe Gly Ser Pro Tyr Leu Ala Cys Tyr Ser Leu Ser Ile Thr
85 90 95

Ile Leu Leu Leu Asn Phe Leu Arg Ser His Cys Phe Thr Gln Ala Met
100 105 110

Leu Ser Gln Pro Arg Met Glu Ser Leu Asp Thr Pro Ala Ala Tyr Ser
115 120 125

Leu Val Leu Ala Leu Leu Gly Leu Gly Val Val Leu Val Leu Ser Ser
130 135 140

Phe Phe Ala Leu Gly Phe Ala Gly Thr Phe Leu Gly Asp Tyr Phe Gly
145 150 155 160

Ile Leu Lys Glu Ala Arg Val Thr Val Phe Pro Phe Asn Ile Leu Asp
165 170 175

Asn Pro Met Tyr Trp Gly Ser Thr Ala Asn Tyr Leu Gly Trp Ala Ile
180 185 190

Met His Ala Ser Pro Thr Gly Leu Leu Leu Thr Val Leu Val Ala Leu
195 200 205

Thr Tyr Ile Val Ala Leu Leu Tyr Glu Glu Pro Phe Thr Ala Glu Ile
210 215 220

Tyr Arg Gln Lys Ala Ser Gly Ser His Lys Arg Ser
225 230 235












<210> 410
<211> 121
<212> PRT
<213> Homo sapiens

<400> 410

Met Asn Thr Glu Ala Glu Gln Gln Leu Leu His His Ala Arg Asn Gly
1 5 10 15

Asn Ala Glu Glu Val Arg Gln Leu Leu Glu Thr Met Ala Ser Asn Glu
20 25 30

Val Ile Ala Asp Ile Asn Cys Lys Gly Arg Ser Lys Ser Asn Leu Gly
35 40 45

Trp Thr Pro Leu His Leu Ala Cys Tyr Phe Gly His Arg Gln Val Val
50 55 60

Gln Asp Leu Leu Lys Ala Gly Ala Glu Val Asn Val Leu Asn Asp Met
65 70 75 80

Gly Asp Thr Pro Leu His Arg Ala Ala Phe Thr Gly Arg Lys Val Lys
85 90 95

Ile Ile Leu Cys Ser Met Phe Val Ser Glu Val Phe Gly Gly Val Val
100 105 110

Thr Ile Val Phe Ser Val Ile Thr Ile
115 120



<210> 411
<211> 170
<212> PRT
<213> Homo sapiens

<400> 411

Met Arg Leu Gln Gly Ala Ile Phe Val Leu Leu Pro His Leu Gly Pro
1 5 10 15

Ile Leu Val Trp Leu Phe Thr Arg Asp His Met Ser Gly Trp Cys Glu
20 25 30

Gly Pro Arg Met Leu Ser Trp Cys Pro Phe Tyr Lys Val Leu Leu Leu
35 40 45

Val Gln Thr Ala Ile Tyr Ser Val Val Gly Tyr Ala Ser Tyr Leu Val
50 55 60

Trp Lys Asp Leu Gly Gly Gly Leu Gly Trp Pro Leu Ala Leu Pro Leu
65 70 75 80

Gly Leu Tyr Ala Val Gln Leu Thr Ile Ser Trp Thr Val Leu Val Leu
85 90 95

Phe Phe Thr Val His Asn Pro Gly Leu Ala Leu Leu His Leu Leu Leu
100 105 110

Leu Tyr Gly Leu Val Val Ser Thr Ala Leu Ile Trp His Pro Ile Asn
115 120 125

Lys Leu Ala Ala Leu Leu Leu Leu Pro Tyr Leu Ala Trp Leu Thr Val
130 135 140

Thr Ser Ala Leu Thr Tyr His Leu Trp Arg Asp Ser Leu Cys Pro Val
145 150 155 160

His Gln Pro Gln Pro Thr Glu Lys Ser Asp
165 170















<210> 412
<211> 236
<212> PRT
<213> Homo sapiens

<400> 412

Met Leu Ser Lys Gly Leu Lys Arg Lys Arg Glu Glu Glu Glu Glu Lys
1 5 10 15

Glu Pro Leu Ala Val Asp Ser Trp Trp Leu Asp Pro Gly His Thr Ala
20 25 30

Val Ala Gln Ala Pro Pro Ala Val Ala Ser Ser Ser Leu Phe Asp Leu
35 40 45

Ser Val Leu Lys Leu His His Ser Leu Gln Gln Ser Glu Pro Asp Leu
50 55 60

Arg His Leu Val Leu Val Val Asn Thr Leu Arg Arg Ile Gln Ala Ser
65 70 75 80

Met Ala Pro Ala Ala Ala Leu Pro Pro Val Pro Ser Pro Pro Ala Ala
85 90 95

Pro Ser Val Ala Asp Asn Leu Leu Ala Ser Ser Asp Ala Ala Leu Ser
100 105 110

Ala Ser Met Ala Ser Leu Leu Glu Asp Leu Ser His Ile Glu Gly Leu
115 120 125

Ser Gln Ala Pro Gln Pro Leu Ala Asp Glu Gly Pro Pro Gly Arg Ser
130 135 140

Ile Gly Gly Ala Ala Pro Ser Leu Gly Ala Leu Asp Leu Leu Gly Pro
145 150 155 160

Ala Thr Gly Cys Leu Leu Asp Asp Gly Leu Glu Gly Leu Phe Glu Asp
165 170 175

Ile Asp Thr Ser Met Tyr Asp Asn Glu Leu Trp Ala Pro Ala Ser Glu
180 185 190

Gly Leu Lys Pro Gly Pro Glu Asp Gly Pro Gly Lys Glu Glu Ala Pro
195 200 205

Glu Leu Asp Glu Ala Glu Leu Asp Tyr Leu Met Asp Val Leu Val Gly
210 215 220

Thr Gln Ala Leu Glu Arg Pro Pro Gly Pro Gly Arg
225 230 235



<210> 413
<211> 191
<212> PRT
<213> Homo sapiens

<400> 413

Met Lys Gly Leu Tyr Phe Gln Gln Ser Ser Thr Asp Glu Glu Ile Thr
1 5 10 15

Phe Val Phe Gln Glu Lys Glu Asp Leu Pro Val Thr Glu Asp Asn Phe
20 25 30

Val Lys Leu Gln Val Lys Ala Cys Ala Leu Ser Gln Ile Asn Thr Lys
35 40 45

Leu Leu Ala Glu Met Lys Met Lys Lys Asp Leu Phe Pro Val Gly Arg
50 55 60

Glu Ile Ala Gly Ile Val Leu Asp Val Gly Ser Lys Val Ser Phe Phe
65 70 75 80

Gln Pro Asp Asp Glu Val Val Gly Ile Leu Pro Leu Asp Ser Glu Asp
85 90 95

Pro Gly Leu Cys Glu Val Val Arg Val His Glu His Tyr Leu Val His
100 105 110

Lys Pro Glu Lys Val Thr Trp Thr Glu Ala Ala Gly Ser Ile Arg Asp
115 120 125

Gly Val Arg Ala Tyr Thr Ala Leu His Tyr Leu Ser His Leu Ser Pro
130 135 140

Gly Lys Ser Val Leu Ile Met Asp Gly Ala Ser Ala Phe Gly Thr Ile
145 150 155 160

Ala Ile Gln Leu Ala His His Arg Gly Ala Lys Val Phe Gln Gln His
165 170 175

Ala Ala Leu Lys Ile Ser Ser Ala Leu Lys Asp Ser Asp Leu Pro
180 185 190







<210> 414
<211> 389
<212> PRT
<213> Homo sapiens

<400> 414

Met Ala Glu Pro Asp Pro Ser His Pro Leu Glu Thr Gln Ala Gly Lys
1 5 10 15

Val Gln Glu Ala Gln Asp Ser Asp Ser Asp Ser Glu Gly Gly Ala Ala
20 25 30

Gly Gly Glu Ala Asp Met Asp Phe Leu Arg Asn Leu Phe Ser Gln Thr
35 40 45

Leu Ser Leu Gly Ser Gln Lys Glu Arg Leu Leu Asp Glu Leu Thr Leu
50 55 60

Glu Gly Val Ala Arg Tyr Met Gln Ser Glu Arg Cys Arg Arg Val Ile
65 70 75 80

Cys Leu Val Gly Ala Gly Ile Ser Thr Ser Ala Gly Ile Pro Asp Phe
85 90 95

Arg Ser Pro Ser Thr Gly Leu Tyr Asp Asn Leu Glu Lys Tyr His Leu
100 105 110

Pro Tyr Pro Glu Ala Ile Phe Glu Ile Ser Tyr Phe Lys Lys His Pro
115 120 125

Glu Pro Phe Phe Ala Leu Ala Lys Glu Leu Tyr Pro Gly Gln Phe Lys
130 135 140

Pro Thr Ile Cys His Tyr Phe Met Arg Leu Leu Lys Asp Lys Gly Leu
145 150 155 160

Leu Leu Arg Cys Tyr Thr Gln Asn Ile Asp Thr Leu Glu Arg Ile Ala
165 170 175

Gly Leu Glu Gln Glu Asp Leu Val Glu Ala His Gly Thr Phe Tyr Thr
180 185 190

Ser His Cys Val Ser Ala Ser Cys Arg His Glu Tyr Pro Leu Ser Trp
195 200 205

Met Lys Glu Lys Ile Phe Ser Glu Val Thr Pro Lys Cys Glu Asp Cys
210 215 220

Gln Ser Leu Val Lys Pro Asp Ile Val Phe Phe Gly Glu Ser Leu Pro
225 230 235 240

Ala Arg Phe Phe Ser Cys Met Gln Ser Asp Phe Leu Lys Val Asp Leu
245 250 255

Leu Leu Val Met Gly Thr Ser Leu Gln Val Gln Pro Phe Ala Ser Leu
260 265 270

Ile Ser Lys Ala Pro Leu Ser Thr Pro Arg Leu Leu Ile Asn Lys Glu
275 280 285

Lys Ala Gly Gln Ser Asp Pro Phe Leu Gly Met Ile Met Gly Leu Gly
290 295 300

Gly Gly Met Asp Phe Asp Ser Lys Lys Ala Tyr Arg Asp Val Ala Trp
305 310 315 320

Leu Gly Glu Cys Asp Gln Gly Cys Leu Ala Leu Ala Glu Leu Leu Gly
325 330 335

Trp Lys Lys Glu Leu Glu Asp Leu Val Arg Arg Glu His Ala Ser Ile
340 345 350

Asp Ala Gln Ser Gly Ala Gly Val Pro Asn Pro Ser Thr Ser Ala Ser
355 360 365

Pro Lys Lys Ser Pro Pro Pro Ala Lys Asp Glu Ala Arg Thr Thr Glu
370 375 380

Arg Glu Lys Pro Gln
385

<210> 415
<211> 481
<212> PRT
<213> Homo sapiens

<400> 415

Met Ser Leu Asn Leu Pro Glu Ala Ser Leu Leu Ser Arg Ala Ser Trp
1 5 10 15

Pro Glu Gln Ala Lys Glu Pro Arg Arg Glu Gly His Thr Asp Lys Gln
20 25 30

Gln Thr Glu Asp Val Leu Ala Ala Gly Leu Arg Cys Leu Pro His Leu
35 40 45

Pro Ala Ile Cys Ala Arg Arg Met Ser Pro Ala Phe Arg Ala Met Asp
50 55 60

Val Glu Pro Arg Ala Lys Gly Val Leu Leu Glu Pro Phe Val His Gln
65 70 75 80

Val Gly Gly His Ser Cys Val Leu Arg Phe Asn Glu Thr Thr Leu Cys
85 90 95

Lys Pro Leu Val Pro Arg Glu His Gln Phe Tyr Glu Thr Leu Pro Ala
100 105 110

Glu Met Arg Lys Phe Thr Pro Gln Tyr Lys Gly Val Val Ser Val Arg
115 120 125

Phe Glu Glu Asp Glu Asp Arg Asn Leu Cys Leu Ile Ala Tyr Pro Leu
130 135 140

Lys Gly Asp His Gly Ile Val Asp Ile Val Asp Asn Ser Asp Cys Glu
145 150 155 160

Pro Lys Ser Lys Leu Leu Arg Trp Thr Thr Asn Lys Lys His His Val
165 170 175

Leu Glu Thr Glu Lys Thr Pro Lys Asp Trp Val Arg Gln His Arg Lys
180 185 190

Glu Glu Lys Met Lys Ser His Lys Leu Glu Glu Glu Phe Glu Trp Leu
195 200 205

Lys Lys Ser Glu Val Leu Tyr Tyr Thr Val Glu Lys Lys Gly Asn Ile
210 215 220

Ser Ser Gln Leu Lys His Tyr Asn Pro Trp Ser Met Lys Cys His Gln
225 230 235 240

Gln Gln Leu Gln Arg Met Lys Glu Asn Ala Lys His Arg Asn Gln Tyr
245 250 255

Lys Phe Ile Leu Leu Glu Asn Leu Thr Ser Arg Tyr Glu Val Pro Cys
260 265 270

Val Leu Asp Leu Lys Met Gly Thr Arg Gln His Gly Asp Asp Ala Ser
275 280 285

Glu Glu Lys Ala Ala Asn Gln Ile Arg Lys Cys Gln Gln Ser Thr Ser
290 295 300

Ala Val Ile Gly Val Arg Val Cys Gly Met Gln Val Tyr Gln Ala Gly
305 310 315 320

Ser Gly Gln Leu Met Phe Met Asn Lys Tyr His Gly Arg Lys Leu Ser
325 330 335

Val Gln Gly Phe Lys Glu Ala Leu Phe Gln Phe Phe His Asn Gly Arg
340 345 350

Tyr Leu Arg Arg Glu Leu Leu Gly Pro Val Leu Lys Lys Leu Thr Glu
355 360 365

Leu Lys Ala Val Leu Glu Arg Gln Glu Ser Tyr Arg Phe Tyr Ser Ser
370 375 380

Ser Leu Leu Val Ile Tyr Asp Gly Lys Glu Arg Pro Glu Val Val Leu
385 390 395 400

Asp Ser Asp Ala Glu Asp Leu Glu Asp Leu Ser Glu Glu Ser Ala Asp
405 410 415

Glu Ser Ala Gly Ala Tyr Ala Tyr Lys Pro Ile Gly Ala Ser Ser Val
420 425 430

Asp Val Arg Met Ile Asp Phe Ala His Thr Thr Cys Arg Leu Tyr Gly
435 440 445

Glu Asp Thr Val Val His Glu Gly Gln Asp Ala Gly Tyr Ile Phe Gly
450 455 460

Leu Gln Ser Leu Ile Asp Ile Val Thr Glu Ile Ser Glu Glu Ser Gly
465 470 475 480

Glu



<210> 416 
<211> 354 
<212> PRT

<213> Homo sapiens 
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Ser 


Ala 
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Gly Gly 


Arg 


Ala 


Phe 


Ala 


Trp 


Gin 


Val 


Phe 


Pro 


Pro 
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5 
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15 




Met Pro Thr Cys Arg Val Tyr Gly Thr Val Ala His Gln Asp Gly His
20 25 30

Leu Leu Val Leu Gly Gly Cys Gly Arg Ala Gly Leu Pro Leu Asp Thr
35 40 45

Ala Glu Thr Leu Asp Met Ala Ser His Thr Trp Leu Ala Leu Ala Pro
50 55 60

Leu Pro Thr Ala Arg Ala Gly Ala Ala Ala Val Val Leu Gly Lys Gln
65 70 75 80

Val Leu Val Val Gly Gly Val Asp Glu Val Gln Ser Pro Val Ala Ala
85 90 95

Val Glu Ala Phe Leu Met Asp Glu Gly Arg Trp Glu Arg Arg Ala Thr
100 105 110

Leu Pro Gln Ala Ala Met Gly Val Ala Thr Val Glu Arg Asp Gly Met
115 120 125

Val Tyr Ala Leu Gly Gly Met Gly Pro Asp Thr Ala Pro Gln Ala Gln
130 135 140

Val Arg Val Tyr Glu Pro Arg Arg Asp Cys Trp Leu Ser Leu Pro Ser
145 150 155 160

Met Pro Thr Pro Cys Tyr Gly Ala Ser Thr Phe Leu His Gly Asn Lys
165 170 175

Ile Tyr Val Leu Gly Gly Arg Gln Gly Lys Leu Pro Val Thr Ala Phe
180 185 190

Glu Ala Phe Asp Leu Glu Ala Arg Thr Trp Thr Arg His Pro Ser Leu
195 200 205

Pro Ser Arg Arg Ala Phe Ala Gly Cys Ala Met Ala Glu Gly Ser Val
210 215 220

Phe Ser Leu Gly Gly Leu Gln Gln Pro Gly Pro His Asn Phe Tyr Ser
225 230 235 240

Arg Pro His Phe Val Asn Thr Val Glu Met Phe Asp Leu Glu His Gly
245 250 255

Ser Trp Thr Lys Leu Pro Arg Ser Leu Arg Met Arg Asp Lys Arg Ala
260 265 270

Asp Phe Val Val Gly Ser Leu Gly Gly His Ile Val Ala Ile Gly Gly
275 280 285

Leu Gly Asn Gln Pro Cys Pro Leu Gly Ser Val Glu Ser Phe Ser Leu
290 295 300

Ala Arg Arg Arg Trp Glu Ala Leu Pro Ala Met Pro Thr Ala Arg Cys
305 310 315 320

Ser Cys Ser Ser Leu Gln Ala Gly Pro Arg Leu Phe Val Ile Gly Gly
325 330 335

Val Ala Gln Gly Pro Ser Gln Ala Val Glu Ala Leu Cys Leu Arg Asp
340 345 350

Gly Val



<210> 417
<211> 20
<212> PRT
<213> Homo sapiens

<400> 417

Met Lys Gly Leu Tyr Phe Gln Gln Ser Ser Thr Asp Glu Glu Ile Thr
1 5 10 15

Phe Val Phe Gln
20



<210> 418
<211> 320
<212> PRT
<213> Homo sapiens

<400> 418

Met Lys Gly Leu Tyr Phe Gln Gln Ser Ser Thr Asp Glu Glu Ile Thr
1 5 10 15

Phe Val Phe Gln Glu Lys Glu Asp Leu Pro Val Thr Glu Asp Asn Phe
20 25 30

Val Lys Leu Gln Val Lys Ala Cys Ala Leu Ser Gln Ile Asn Thr Lys
35 40 45

Leu Leu Ala Glu Met Lys Met Lys Lys Asp Leu Phe Pro Val Gly Arg
50 55 60

Glu Ile Ala Gly Ile Val Leu Asp Val Gly Ser Lys Val Ser Phe Phe
65 70 75 80

Gln Pro Asp Asp Glu Val Val Gly Ile Leu Pro Leu Asp Ser Glu Asp
85 90 95

Pro Gly Leu Cys Glu Val Val Arg Val His Glu His Tyr Leu Val His
100 105 110

Lys Pro Glu Lys Val Thr Trp Thr Glu Ala Ala Gly Ser Ile Arg Asp
115 120 125

Gly Val Arg Ala Tyr Thr Ala Leu His Tyr Leu Ser His Leu Ser Pro
130 135 140

Gly Lys Ser Val Leu Ile Met Asp Gly Ala Ser Ala Phe Gly Thr Ile
145 150 155 160

Ala Ile Gln Leu Ala His His Arg Gly Ala Lys Val Ile Ser Thr Ala
165 170 175

Cys Ser Leu Glu Asp Lys Gln Cys Leu Glu Arg Phe Arg Pro Pro Ile
180 185 190

Ala Arg Val Ile Asp Val Ser Asn Gly Lys Val His Val Ala Glu Ser
195 200 205

Cys Leu Glu Glu Thr Gly Gly Leu Gly Val Asp Ile Val Leu Asp Ala
210 215 220

Gly Val Arg Leu Tyr Ser Lys Asp Asp Glu Pro Ala Val Lys Leu Gln
225 230 235 240

Leu Leu Pro His Lys His Asp Ile Ile Thr Leu Leu Gly Val Gly Gly
245 250 255

His Trp Val Thr Thr Glu Glu Asn Leu Gln Leu Asp Pro Pro Asp Ser
260 265 270

His Cys Leu Phe Leu Lys Gly Ala Thr Leu Ala Phe Leu Asn Asp Glu
275 280 285

Val Trp Asn Leu Ser Asn Val Gln Gln Gly Lys Tyr Leu Tyr Leu Lys
290 295 300

Gly Cys Asp Gly Glu Val Ile Asn Trp Cys Phe Gln Thr Ser Val Gly
305 310 315 320



<210> 419 
<211> 159 
<212> PRT 
<213> Homo sapiens

<400> 419 

Met Glu Lys Leu Arg
1 5
Gly Leu Thr Ala Gln
20

Arg Leu Lys Trp Phe
35

Ile Leu Gly Thr Gly
50

Ala Val Phe Tyr Thr
65

Phe Leu Met Gly Pro
85

Arg Leu Leu Ala Thr
100

Cys Ala Ala Leu Trp
115

Ile Leu Gln Phe Leu
130

Tyr Ala Arg Asp Ala
145

Arg Val Leu Ser Gly
10

Val Leu Asp Ala Ser
25

Ala Ile Cys Phe Val
40

Leu Leu Trp Leu Pro
55

Leu Gly Asn Leu Ala
70

Val Lys Gln Leu Lys
90

Ile Val Met Leu Leu
105

Trp His Lys Lys Gly
120

Ser Met Thr Trp Tyr
135

Val Ile Lys Cys Cys
150



Gin 


Asp 


Asp 


Glu 


Glu 
15 


Gin 


Ser 


Leu 


Ser 


Phe 
30 


Asn 


Thr 


Cys 


Gly 


Val 
45 


Phe 


Phe 


Ser 


Gly 


Gly 
60 


He 


Lys 


Leu 


Phe 


Ala 


Leu 


Ala 


Ser 


Thr 


Cys 


75 










80 


Lys 


Met 


Phe 


Glu 


Ala 
95 


Thr 


Cys 


Phe 


He 


Phe 
110 


Thr 


Leu 


Leu 


Ala 


Val 
125 


Leu 


Phe 


Cys 


Ser 


Leu 
140 


Ser 


Tyr 


He 


Pro 


Ser 


Ser 


Leu 


Leu 


Ser 





155 



<210> 420 
<211> 183
<212> PRT
<213> Homo sapiens

<400> 420

Met Glu Gln Arg Leu Ala Glu Phe Arg Ala Ala Arg Lys Arg Ala Gly
1 5 10 15

Leu Ala Ala Gln Pro Pro Ala Ala Ser Gln Gly Ala Gln Thr Pro Gly
20 25 30
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10 



Glu Lys Ala Glu Ala Ala Ala Thr Leu Lys Ala Ala Pro Gly Trp Leu
35 40 45

Lys Arg Phe Leu Val Trp Lys Pro Arg Pro Ala Ser Ala Arg Ala Gln
50 55 60

Pro Gly Leu Val Gln Glu Ala Ala Gln Pro Gln Gly Ser Thr Ser Glu
65 70 75 80

Thr Pro Trp Asn Thr Ala Ile Pro Leu Pro Ser Cys Trp Asp Gln Ser
85 90 95

Phe Leu Thr Asn Ile Thr Phe Leu Lys Val Leu Leu Trp Leu Val Leu
100 105 110

Leu Gly Leu Phe Val Glu Leu Glu Phe Gly Leu Ala Tyr Phe Val Leu
115 120 125

Ser Leu Phe Tyr Trp Met Tyr Val Gly Thr Arg Gly Pro Glu Glu Lys
130 135 140

Lys Glu Gly Glu Lys Ser Ala Tyr Ser Val Phe Asn Pro Gly Cys Glu
145 150 155 160

Ala Ile Gln Gly Thr Leu Thr Ala Glu Gln Leu Glu Arg Glu Leu Gln
165 170 175

Leu Arg Pro Leu Ala Gly Arg
180



<210> 421 
<211> 143 
<212> PRT 
<213> Homo sapiens

<400> 421

Met Ala Ala Pro Arg Arg Gly Arg Gly Ser Ser Thr Val Leu Ser Ser
1 5 10 15

Val Pro Leu Gln Met Leu Phe Tyr Leu Ser Gly Thr Tyr Tyr Ala Leu
20 25 30

Tyr Phe Leu Ala Thr Leu Leu Met Ile Thr Tyr Lys Ser Gln Val Phe
35 40 45

Ser Tyr Pro His Arg Tyr Leu Val Leu Asp Leu Ala Leu Leu Phe Leu
50 55 60

Met Gly Ile Leu Glu Ala Val Arg Leu Tyr Leu Gly Thr Arg Gly Asn
65 70 75 80

Leu Thr Glu Ala Glu Arg Pro Leu Ala Ala Ser Leu Ala Leu Thr Ala
85 90 95

Gly Thr Ala Leu Leu Ser Ala His Phe Leu Leu Trp Gln Ala Leu Val
100 105 110

Leu Trp Ala Asp Trp Ala Leu Ser Ala Thr Leu Leu Ala Leu His Gly
115 120 125

Leu Glu Ala Val Leu Gln Val Val Ala Ile Ala Ala Phe Thr Arg
130 135 140











<210> 422 

<211> 73 

<212> PRT 

<213> Homo sapiens

<400> 422

Met Ser Gly Val Pro Ala Glu Met Thr Gly Ala Val Glu Ala Phe Leu
1 5 10 15

Pro Val Val Ser Ser Ser Arg Arg Leu Pro Arg Phe Val His Met Val
20 25 30

Ala Gly Val Ser Ser Lys Gln Glu Arg Ala Arg Ser Asn Thr Glu Ala
35 40 45

Leu Phe Lys Leu Cys Phe His His Ile Cys Gln Cys Leu Thr Asp Glu
50 55 60

His Lys Phe His Gly Gln Val Gln Phe
65 70

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<210> 423 
<211> 142 
<212> PRT 
<213> Homo sapiens 

<400> 423 
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Pro 


Pro 


Phe 


Gly 


Gly 


His 


Pro 


Leu 


Ser 


Gin 


Glu 


Glu 


Asp 


Gly 


Ser 
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10 
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Gin 


Arg 


Cys 


Cys 


Cys 


Leu 


Ser 


Ser 


Leu 


Arg 


Ser 


Val 


Asp 


Asp 


Ser 


Asn 
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20 










25 
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Gly 


Glu 


Thr 


Val 


Val 


He 


Met 


Ala 


Leu 


Phe 


Leu 


Ala 


Val 


Ser 


Tyr 


His 








35 










40 










45 










His 


Lys 


Thr 


Gin 


Ser 


Lys 


Arg 


Trp 


Pro 


Gly 


Leu 


Thr 


Pro 


Pro 


His 


Ser 






50 










55 










60 
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Ser 


Leu 


Leu 


Cys 


Arg 


Pro 


Leu 


Gin 


Leu 


Ser 


Phe 


Leu 


Val 


He 


Gin 


Ser 
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70 










75 










80 




Val 


Arg 


Met 


Arg 


Ala 


Cys 


Gly 


Cys 


Asp 


Ser 


Gly 


His 


Cys 


Arg 


He 


Leu 












85 










90 










95 






Gly 


Arg 


Tyr 


Ser 


Leu 


Leu 


Gly 


Trp 


Ser 


Gin 


Gly 


His 


Arg 


Ala 


Arg 


Gly 


20 








100 










105 










110 








Arg 


Gly 


Gly Val 


Ser 


Leu 


Arg 


Asp 


Asn 


Thr 


Phe 


Phe 


Gin 


Glu 


Ala 


Ser 








115 










120 










125 










Glu 


Gly 


Gin 


Gly 


Gin 


Trp 


Leu 


Met 


Pro 


Val 


He 


Pro 


Ala 


Phe 










130 










135 










140 
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<212> PRT 
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Leu 


Lys 


Pro 


Arg 


Arg 
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Trp 


Arg 


Thr 


Ala 
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Leu 


Arg 


Arg 


Tyr 
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Pro 


Thr 


Asp 
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Gin 


Ala 


Pro 


Arg 


Ser 


Pro 


35 








20 
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Val 


Pro 


Pro 


He 


Arg 


Lys 


Val 
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He 


Ser 


Asp 


Val 


He 


Val 


His 


Ala 








35 










40 










45 
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Ala 
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Lys 


Lys 


Asn 


Thr 


Cys 


Asn 


Cys 


Gin 


Ala 


Asp 






50 










55 










60 










40 


Leu 


Leu 


Ser 


Trp 


Arg 


Ser 


Trp 


Val 


Asn 
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He 


Ser 
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His 


Cys 


Pro 




65 










70 
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80 
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Arg 
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Ser 


Lys 


Ser 


He 


Phe 


Arg 
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Ser 


Thr 


Ser 


Leu 












85 










90 
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Cys 


Ser 
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Ser 
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Gin 


Arg 
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Cys 
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Ser 
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Pro 
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Leu 


Phe 


Val 
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Val 
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Ala 
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Phe 


Arg 


Val 


Gin 








115 










120 










125 










Ala 


Gly 
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Arg 


Val 


Gly 
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Lys 


Thr 


Arg 


Val 


Ser 


Arg 
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Thr 


Leu 
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Met 
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Asn 


Arg 


Ser 


Glu 


Leu 


Cys 


Asn 


Phe 
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Leu 


Ser 
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10 
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Leu 


Asn 
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Tyr 


Gly 


Lys 


Gly 


Phe 


Phe 


Ser 


Leu 


Val 


Glu 


Lys 


His 


Asn 
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Ser 


Arg 
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Leu 


Glu 


Asp 


Arg 


Ala 


Ser 


Ser 


Gly 


Pro 


Ser 


Leu 


Ser 


Ser 
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35 40 45 

Pro Ser His Pro Asp Trp Gly Tyr Ile Val Leu Ile Leu Val Ala Thr
50 55 60

Leu Gly Glu Leu Asp Thr Gln Val Gly Gly His
65 70 75

<210> 426 
<211> 168 
<212> PRT 
<213> Homo sapiens

<400> 426

Met Arg Leu Thr Glu Lys Ser Glu Gly Glu Gln Gln Leu Lys Pro Asn
1 5 10 15

Asn Ser Asn Ala Pro Asn Glu Asp Gln Glu Glu Glu Ile Gln Gln Ser
20 25 30

Glu Gln His Thr Pro Ala Arg Gln Arg Thr Gln Arg Ala Asp Thr Gln
35 40 45

Pro Ser Arg Cys Arg Leu Pro Ser Arg Arg Thr Pro Thr Thr Ser Ser
50 55 60

Asp Arg Thr Ile Asn Leu Leu Glu Val Leu Pro Trp Pro Thr Glu Trp
65 70 75 80

Ile Phe Asn Pro Tyr Arg Leu Pro Ala Leu Phe Glu Leu Tyr Pro Glu
85 90 95

Phe Leu Leu Val Phe Lys Glu Ala Phe His Asp Ile Ser His Cys Leu
100 105 110

Lys Ala Gln Met Glu Lys Ile Gly Leu Pro Ile Ile Leu His Leu Phe
115 120 125

Ala Leu Ser Thr Leu Tyr Phe Tyr Lys Phe Phe Leu Pro Thr Ile Leu
130 135 140

Ser Leu Ser Phe Phe Ile Leu Leu Val Leu Leu Leu Leu Leu Phe Ile
145 150 155 160

Ile Val Phe Ile Leu Ile Phe Phe
165






<210> 427 

<211> 160 

<212> PRT 

<213> Homo sapiens 



<400> 427 

Met Pro Arg Ser Ser Arg Ser Pro Gly Asp Pro Gly Ala Leu Leu Glu
1 5 10 15

Asp Val Ala His Asn Pro Arg Pro Arg Arg Ile Ala Gln Arg Gly Arg
20 25 30

Asn Thr Ser Arg Met Ala Glu Asp Thr Ser Pro Asn Met Asn Asp Asn
35 40 45

Ile Leu Leu Pro Val Arg Asn Asn Asp Gln Ala Leu Gly Leu Thr Gln
50 55 60

Cys Met Leu Gly Cys Val Ser Trp Phe Thr Cys Phe Ala Cys Ser Leu
65 70 75 80

Arg Thr Gln Ala Gln Gln Val Leu Phe Asn Thr Cys Arg Asp Arg Val
85 90 95

Ser Pro Cys Cys Pro Gly Trp Ser Gln Thr Pro Val Ile Leu Pro Pro
100 105 110

Gln Pro Ser Glu Val Leu Gly Leu Gln Met Gln Ala Ala Val Pro Glu
115 120 125

Ala His Gly Glu Asp Arg His Ser Ala Pro Leu Cys Phe Arg Cys Val
130 135 140

Pro Gly Pro Cys Pro Val Pro Gly Gly Gly Ile Pro Gly Pro Trp His
145 150 155 160

<210> 428 
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<211> 94 
<212> PRT 

<213> Homo sapiens 

<400> 428

Met Asn Lys Glu Ile Asp Ser Leu
1 5
Leu Leu Pro Ala Phe Leu Asp Thr
20

Gly Phe Met Val Arg Ser Arg Val
35 40
Pro Arg Ser Ser Gln Glu Ser Arg
50 55
Ser Ala Leu His Lys Pro Gly Gly
65 70

Ser His Leu Leu Val Trp Glu Gln
85



Asn Leu 


Ala 


Tyr 


Ser 


Phe 


Pro 


Phe 


10 










15 




Pro Trp Thr Asp 


Pro 


Phe 


Pro 


Ser 
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5 Met Leu Ala Arg Ala Thr Phe Arg Ala Ala Ser Ala Pro Thr Leu Val 
1 5 10 15

Ala Arg Arg Gly Phe Gln Ser Thr Arg Ala Gln Met Ala Ser Pro Tyr
20 25 30

His Tyr Pro Glu Gly Pro Arg Ser Asn Leu Pro Phe Asp Pro Leu Lys
35 40 45

Lys Gly Phe Ala Phe Lys Tyr Trp Gly Phe Met Gly Thr Gly Phe Ala
50 55 60

Leu Pro Phe Leu Leu Ala Val Trp Gln Thr Glu Gln Ala Val Asn Ala
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15 Leu Arg His Gly Val Asp Met Arg lie Gly lie Pro Gly Asn Thr Ala 
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25 Arg Glu Gly Ala Arg Ala Arg Pro Ser Pro Thr Met Ser Asp Glu Ala 
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Ser Ala He Thr Ser Tyr Glu Lys Phe Leu Thr Pro Glu Glu Pro Phe 
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Pro Leu Leu Gly Pro Pro Arg Gly Val Gly Thr Cys Pro Ser Glu Glu 
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Gin Leu Ser Ser Cys His Arg Thr Asp Pro Leu His Arg Phe His Thr 
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Phe Val Met Gly Cys Ala Val Gly Met Ala Ala Gly Ala Leu Phe Gly 
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Lys Glu Glu Pro 
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Val 
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Gly 
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Val 


Arg 
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Ala 
20 


Ala 


Asp 


Phe 


Ala 


Glu 
25 


Gin 


Phe 


Arg 


Ser 


Tyr 
30 


Ser 


Glu 




Ser 


Glu 


Lys 
35 


Gin 


Trp 


Lys 


Ala 


Arg 
40 


Met 


Glu 


Phe 


He 


Leu 
45 


Arg 


His 


Leu 


30 


Pro 


Asp 
50 


Tyr 


Arg 


Asp 


Pro 


Pro 
55 


Asp 


Gly 


Ser 


Gly 


Arg 
60 


Leu 


Asp 


Gin 


Leu 




Leu 


Ser 


Leu 


Ser 


Met 


Val 


Trp 


Ala 


Asn 


His 


Leu 


Phe 


Leu 


Gly 


Cys 


Ser 
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70 










75 










80 




Tyr 
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Lys 


Asp 


Leu 


Leu 


Asp 


Lys 


Val 


Met 


Glu 


Met 


Ala 


Asp 


Gly 


He 


35 










85 










90 










95 






Glu 


Val 


Glu 


Asp 
100 


Leu 


Pro 


Gin 


Phe 


Thr 
105 


Thr 


Arg 


Ser 


Glu 


Leu 
110 


Met 


Lys 




Lys 


His 


Gin 
115 


Ser 



























40 
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<212> PRT 

<213> Homo sapiens 



45 

<400> 459 



Met 


Glu 
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Tyr 


Arg 


Lys 


Ala 


Gly 


Ser 


val 


Glu 


Leu 


Pro 


Ala 


Pro 


Ser 


1 








5 










10 










15 




Pro 


Met 


Pro 


Gin 


Leu 


Pro 


Pro 


Asp 


Thr 


Leu 


Glu 


Met 


Arg 


Val 


Arg 


Asp 
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30 






Gly 


Ser 


Lys 


He 


Arg 


Asn 


Leu 


Leu 


Gly 


Leu 


Ala 


Leu 


Gly 


Arg 


Leu 


Glu 






35 










40 










45 








Gly 


Gly 


Ser 


Ala 


Arg 


His 


Val 


Val 


Phe 


Ser 


Gly 


Ser 


Gly 


Arg 


Ala 


Ala 




50 










55 










60 
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Lys 


Ala 


Val 


Ser 


Cys 


Ala 


Glu 


He 


Val 


Lys 


Arg 


Arg 


Val 


Pro Gly 
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Leu 
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Gin 


Leu 


Thr 


Lys 


Leu 


Arg 


Phe 


Leu 


Gin 


Thr 


Glu 


Asp 


Ser 


Trp 
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90 
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Val 


Pro 


Ala 


Ser 


Pro 


Asp 


Thr Gly 


Leu 


Asp 


Pro 


Leu 


Thr 


Val 


Arg 


Arg 
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105 










110 
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Val 


Pro 


Ala 


Val 


Trp 


Val 


Leu 


Leu 


Ser 


Arg 


Asp 


Pro 


Leu 


Asp 


Pro 






115 










120 
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Asn 


Glu 


Cys 


Gly 


Tyr 


Gin 


Pro 


Pro 


Gly 


Ala 


Pro 


Pro 


Gly 


Leu 


Gly 


Ser 
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130 135 140 

Met Pro Ser Ser Ser Cys Gly Pro Arg Ser Arg Arg Arg Ala Arg Asp 
145 150 155 160 

Thr Arg Ser 

<210> 460
<211> 230
<212> PRT
<213> Homo sapiens

<400> 460





Met 


Val 
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Phe 


Gly 


Tyr 


Glu 


Ala 


Gly 


Thr 


Lys 


Pro 


Arg 


Asp 


Ser 


Gly 
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5 










10 










15 






Val Val Pro Val Gly Thr Glu Glu Ala Pro Lys Val Phe Lys Met Ala
20 25 30

Ala Ser Met His Gly Gln Pro Ser Pro Ser Leu Glu Asp Ala Lys Leu
35 40 45

Arg Arg Pro Met Val Ile Glu Ile Ile Glu Lys Asn Phe Asp Tyr Leu
50 55 60

Arg Lys Glu Met Thr Gln Asn Ile Tyr Gln Met Ala Thr Phe Gly Thr
65 70 75 80

Thr Ala Gly Phe Ser Gly Ile Phe Ser Asn Phe Leu Phe Arg Arg Cys
85 90 95

Phe Lys Val Lys His Asp Ala Leu Lys Thr Tyr Ala Ser Leu Ala Thr
100 105 110

Leu Pro Phe Leu Ser Thr Val Val Thr Asp Lys Leu Phe Val Ile Asp
115 120 125

Ala Leu Tyr Ser Asp Asn Ile Ser Lys Glu Asn Cys Val Phe Arg Ser
130 135 140

Ser Leu Ile Gly Ile Val Cys Gly Val Phe Tyr Pro Ser Ser Leu Ala
145 150 155 160

Phe Thr Lys Asn Gly Arg Leu Ala Thr Lys Tyr His Thr Val Pro Leu
165 170 175

Pro Pro Lys Gly Arg Val Leu Ile His Trp Met Thr Leu Cys Gln Thr
180 185 190

Gln Met Lys Leu Met Ala Ile Pro Leu Val Phe Gln Ile Met Phe Gly
195 200 205

Ile Leu Asn Gly Leu Tyr His Tyr Ala Val Phe Glu Glu Thr Leu Glu
210 215 220

Lys Thr Ile His Glu Glu
225 230
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<220> 

<221> UNSURE 
50 <222> 95 

<223> Xaa = Cys , Trp 



<400> 461 
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Glu 
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Ala 


Ala 


Leu 


Asn 


Ala 
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Gin 


Pro 


Pro 


Glu 


1 








5 
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15 




Phe 


Arg 


Asn 


Glu 


Ser 


Ser 


Leu 


Ala 


Ser 


Thr 


Leu 


Lys 


Thr 


Leu 


Leu 


Phe 








20 
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Phe 


Thr 


Ala 


Leu 


Met 


He 


Thr 


Val 


Pro 


He 


Gly 


Leu 


Tyr 


Phe 


Thr 


Thr 






35 










40 










45 








Lys 


Ser 


Tyr 


He 


Phe 


Glu 


Gly 


Ala 


Leu 


Gly 


Met 


Ser 


Asn 


Arg 


Asp 


Ser 




50 










55 










60 










Tyr 
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Ala Ala 


He 


Val 


Ala 


Val 


Val 
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Val 


His 


Val 


Val 


Leu 
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Ala Leu Phe Val Tyr Val Ala Trp Asn Glu Gly Ser Arg Gin Xaa Arg 

85 90 95 

Glu Gly Lys Gln Asp
100

<210> 462 

<211> 93 

<212> PRT 

<213> Homo sapiens 

<400> 462

Met Asp Ser Leu Arg Lys Met Leu Ile Ser Val Ala Met Leu Gly Ala
1 5 10 15

Gly Ala Gly Val Gly Tyr Ala Leu Leu Val Ile Val Thr Pro Gly Glu
20 25 30

Arg Arg Lys Gln Glu Met Leu Lys Glu Met Pro Leu Gln Asp Pro Arg
35 40 45

Ser Arg Glu Glu Ala Ala Arg Thr Gln Gln Leu Leu Leu Ala Thr Leu
50 55 60

Gln Glu Ala Ala Thr Thr Gln Glu Asn Val Ala Trp Arg Lys Asn Trp
65 70 75 80

Met Val Gly Gly Glu Gly Gly Ala Gly Gly Arg Ser Pro
85 90



<210> 463
<211> 133
<212> PRT
<213> Homo sapiens

<400> 463

Met Gly His Gly Asp Glu Ile Val Leu Ala Asp Leu Asn Phe Pro Ala
1 5 10 15

Ser Ser Ile Cys Gln Cys Gly Pro Met Glu Ile Arg Ala Asp Gly Leu
20 25 30

Gly Ile Pro Gln Leu Leu Glu Ala Val Leu Lys Leu Leu Pro Leu Asp
35 40 45

Thr Tyr Val Glu Ser Pro Ala Ala Val Met Glu Leu Val Pro Ser Asp
50 55 60

Lys Glu Arg Gly Leu Gln Thr Pro Val Trp Thr Glu Tyr Glu Ser Ile
65 70 75 80

Leu Arg Arg Ala Gly Cys Val Arg Ala Leu Ala Lys Ile Glu Arg Phe
85 90 95

Glu Phe Tyr Glu Arg Ala Lys Lys Ala Phe Ala Val Val Ala Thr Gly
100 105 110

Glu Thr Ala Leu Tyr Gly Asn Leu Ile Leu Arg Lys Gly Val Leu Ala
115 120 125

Leu Asn Pro Leu Leu
130



<210> 464
<211> 95
<212> PRT
<213> Homo sapiens

<400> 464

Met Gly His Gly Asp Glu Ile Val Leu Ala Asp Leu Asn Phe Pro Ala
1 5 10 15

Ser Ser Ile Cys Gln Cys Gly Pro Met Glu Ile Arg Ala Asp Gly Leu
20 25 30

Gly Ile Pro Gln Leu Leu Glu Ala Val Leu Ala Ala Ala Pro Gly His
35 40 45

Leu Cys Gly Glu Ser Gly Cys Ser His Gly Ala Gly Ala Gln Arg Gln
50 55 60

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Gly Glu Gly Pro Ala Asp Pro Ser Val Asp Gly Val Arg Val His Pro 
65 70 75 80 

Thr Gln Gly Arg Leu Cys Glu Ser Pro Gly Lys Asp Arg Glu Val
85 90 95

<210> 465 

<211> 93 

<212> PRT 

<213> Homo sapiens 

<400> 465

Met Thr Pro Ile Lys Leu Leu Asn Leu Thr Ser Arg Tyr Asn Phe Arg
1 5 10 15

Arg Thr Phe Gly Ile Glu Leu Ser Ser Asn Ser Ser Tyr Cys Lys Arg
20 25 30

Gly Asn Gly Tyr Arg Ser Arg Val Pro Lys Glu Cys Glu Cys Asn Trp
35 40 45

Leu His Leu Glu Ser Asp Thr Leu Lys Lys Leu Pro Ile Ile Ser Pro
50 55 60

Ser Trp Thr Cys Arg Ile Ile Leu Phe Leu Tyr Phe Ser Gly Gln Leu
65 70 75 80

Leu Gln Leu Ser Leu Ser Cys Leu Gln Leu Ile Lys Leu
85 90

<210> 466
<211> 500
<212> PRT
<213> Homo sapiens

<400> 466

Met Glu Val Ser Thr Asn Pro Ser Ser Asn Ile Asp Pro Gly Asn Tyr
1 5 10 15

Val Glu Met Asn Asp Ser Ile Thr His Leu Pro Ser Lys Val Val Ile
20 25 30

Gln Asp Ile Thr Met Glu Leu His Cys Pro Leu Cys Asn Asp Trp Phe
35 40 45

Arg Asp Pro Leu Met Leu Ser Cys Gly His Asn Phe Cys Glu Ala Cys
50 55 60

Ile Gln Asp Phe Trp Arg Leu Gln Ala Lys Glu Thr Phe Cys Pro Glu
65 70 75 80

Cys Lys Met Leu Cys Gln Tyr Asn Asn Cys Thr Phe Asn Pro Val Leu
85 90 95

Asp Lys Leu Val Glu Lys Ile Lys Lys Leu Pro Leu Leu Lys Gly His
100 105 110

Pro Gln Cys Pro Glu His Gly Glu Asn Leu Lys Leu Phe Ser Lys Pro
115 120 125

Asp Gly Lys Leu Ile Cys Phe Gln Cys Lys Asp Ala Arg Leu Ser Val
130 135 140

Gly Gln Ser Lys Glu Phe Leu Gln Ile Ser Asp Ala Val His Phe Phe
145 150 155 160

Met Glu Glu Leu Ala Ile Gln Gln Gly Gln Leu Glu Thr Thr Leu Lys
165 170 175

Glu Leu Gln Thr Leu Arg Asn Met Gln Lys Glu Ala Ile Ala Ala His
180 185 190

Lys Glu Asn Lys Leu His Leu Gln Gln His Val Ser Met Glu Phe Leu
195 200 205

Lys Leu His Gln Phe Leu His Ser Lys Glu Lys Asp Ile Leu Thr Glu
210 215 220

Leu Arg Glu Glu Gly Lys Ala Leu Asn Glu Glu Met Glu Leu Asn Leu
225 230 235 240

Ser Gln Leu Gln Glu Gln Cys Leu Leu Ala Lys Asp Met Leu Val Ser
245 250 255

Ile Gln Ala Lys Thr Glu Gln Gln Asn Ser Phe Asp Phe Leu Lys Asp
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260 265 270 

Ile Thr Thr Leu Leu His Ser Leu Glu Gln Gly Met Lys Val Leu Ala
275 280 285

Thr Arg Glu Leu Ile Ser Arg Lys Leu Asn Leu Gly Gln Tyr Lys Gly
290 295 300

Pro Ile Gln Tyr Met Val Trp Arg Glu Met Gln Asp Thr Leu Cys Pro
305 310 315 320

Gly Leu Ser Pro Leu Thr Leu Asp Pro Lys Thr Ala His Pro Asn Leu
325 330 335

Val Leu Ser Lys Ser Gln Thr Ser Val Trp His Gly Asp Ile Lys Lys
340 345 350

Ile Met Pro Asp Asp Pro Glu Arg Phe Asp Ser Ser Val Ala Val Leu
355 360 365

Gly Ser Arg Gly Phe Thr Ser Gly Lys Trp Tyr Trp Glu Val Glu Val
370 375 380

Ala Lys Lys Thr Lys Trp Thr Val Gly Val Val Arg Glu Ser Ile Ile
385 390 395 400

Arg Lys Gly Ser Cys Pro Leu Thr Pro Glu Gln Gly Phe Trp Leu Leu
405 410 415

Arg Leu Arg Asn Gln Thr Asp Leu Lys Ala Leu Asp Leu Pro Ser Phe
420 425 430

Ser Leu Thr Leu Thr Asn Asn Leu Asp Lys Val Gly Ile Tyr Leu Asp
435 440 445

Tyr Glu Gly Gly Gln Leu Ser Phe Tyr Asn Ala Lys Thr Met Thr His
450 455 460

Ile Tyr Thr Phe Ser Asn Thr Phe Met Glu Lys Leu Tyr Pro Tyr Phe
465 470 475 480

Cys Pro Cys Leu Asn Asp Gly Arg Glu Asn Lys Glu Pro Leu His Ile
485 490 495

Leu His Pro Gln
500

<210> 467 
<211> 140 
<212> PRT
<213> Homo sapiens

<400> 467

Met Val Leu Thr Lys Pro Leu Gln Arg Asn Gly Ser Met Met Ser Phe
1 5 10 15

Glu Asn Val Lys Glu Lys Ser Arg Glu Gly Gly Pro His Ala His Thr
20 25 30

Pro Glu Glu Glu Leu Cys Phe Val Val Thr His Tyr Pro Gln Val Gln
35 40 45

Thr Thr Leu Asn Leu Phe Phe His Ile Phe Lys Val Leu Thr Gln Pro
50 55 60

Leu Ser Leu Leu Trp Gly Cys Asp Gln Lys Pro Arg Thr Val Pro Thr
65 70 75 80

Leu Gly Asn Gly Ala Trp Asp Thr Cys Gln Gln His Ile Arg Thr Ser
85 90 95

Ser Trp Thr Ala Asn Thr Leu Val Ile Gln Asn Gln His Ser Arg Glu
100 105 110

Ser Thr Val Ser Val Cys Leu Phe Met Leu Ile Arg Met Gln His Ile
115 120 125

Leu Lys Thr Asp Thr Leu Gln Gln Phe Arg Ile Cys
130 135 140

<210> 468 
<211> 100 
<212> PRT

<213> Homo sapiens 

<400> 468 
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Met 


Tyr 


Met 


Leu 


Leu 


Ser 


Pro 


His 


Arg 


Leu 


Arg Glu Gin Ala 


Gly Val 


1 








5 










10 




15 




Arg 


Gly 


Ser 


He 
20 


Arg 


Thr 


Ala 


Asn 


Arg 
25 


Thr 


Glu Asp Gly Leu 
30 


Lys 


He 


Arg 


Glu 


Ala 


Glu 


Ser 


Leu 


Pro 


Gin 


Ser 


Asn 


Thr Ala Asp Phe 


Lys 


Cys 




35 










40 






45 






Leu 


His 
50 


Ser 


Ala 


Ser 


Leu 


Gin 
55 


Gin 


Ala 


Pro 


Gly Gly He Leu 
60 


Met 


Gly 


Pro 


Ala 


Ser 


Ser 


Pro 


Trp 


Thr 


Leu 


Ala 


Val 


Glu Gly Glu Lys 


Arg 


Thr 


65 










70 










75 




80 


Ser 


Ala 


Pro 


Pro 


Leu 
85 


Arg 


Glu 


Ser 


Leu 


Met 
90 


Pro Thr Lys Gly 


Leu 
95 


Gly 


Trp 


Trp 


Thr 


Gin 
100 





















15 

<210> 469 

<211> 119 
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20 

<400> 469 





Met 


Ala 


Ser 


Tyr 


Ser 


Gly 


Phe 


Ser 


Gly 


Leu 


Leu 


Glu 


He 


Arg 


Tyr 


Gly 
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Pro 


Gly 


His 


Arg 


Ser 


Cys 


Leu 


Pro 


Gin 


Phe 


Ala 


Phe 


Phe 


Pro 


Gin 


Pro 


25 








20 










25 










30 








Pro 


Leu 


Pro 
35 


Arg 


Pro 


Arg 


He 


Cys 
40 


Met 


Trp 


Val 


Leu 


Ala 
45 


Glu 


Leu 


Leu 




Glu 


Leu 
50 


Gly 


Cys 


Pro 


Glu 


Gin 
55 


Ser 


Leu 


Arg 


Asp 


Ala 
60 


He 


Thr 


Leu 


Asp 


30 


Leu 
65 


Phe 


Cys 


His 


Ala 


Leu 
70 


He 


Phe 


Cys 


Arg 


Gin 
75 


Gin 


Gly 


Phe 


Ser 


Leu 
80 




Glu 


Gin 


Thr 


Ser 


Ala 
85 


Ala 


Cys 


Ala 


Leu 


Leu 
90 


Gin 


Asp 


Leu 


His 


Lys 
95 


Ala 




Cys 


He 


Gly 


Glu 


Arg 


Gly 


Gin 


Leu 


Pro 


Gly 


Leu 


Ser 


Pro 


Arg 


Glu 


Lys 


35 


Arg 


Asn 


Arg 
115 


100 

Ala 


Trp 


His 


Lys 




105 










110 
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<400> 470 



45 


Met 
1 


Arg 


Ser 


Glu 


Cys 
5 


Val 


Leu 


Gly 


Ala 


Ala 
10 


Ser 


Asp 


Ser 


Gly 


Gin 
15 


Glu 




Ala 


Pro 


Arg 


Asp 
20 


Thr 


Trp 


Phe 


Leu 


Gin 
25 


Gly 


Trp 


Lys 


Ala 


Ser 
30 


Arg 


Arg 




Phe 


Leu 


He 


Lys 


Gly 


Ser 


Val 


Ala 


Gly Gly 


Ala 


Val 


Tyr 


Leu 


Val 


Tyr 


50 






35 










40 










45 










Asp 


Gin 
50 


Glu 


Leu 


Leu 


Gly 


Pro 
55 


Ser 


Asp 


Lys 


Ser 


Gin 
60 


Ala 


Ala 


Leu 


Gin 




Lys 


Ala 


Gly 


Glu 


Val 


Val 


Pro 


Pro 


Ala 


Met 


Tyr 


Gin 


Phe 


Ser 


Gin 


Tyr 
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75 










80 


55 


Val 


Cys 


Gin 


Gin 


Thr 
85 


Gly 


Leu 


Gin 


He 


Pro 
90 


Gin 


Leu 


Pro 


Ala 


Pro 
95 


Pro 




Lys 


He 


Tyr 


Phe 
100 


Pro 


He 


Arg 


Asp 


Ser 
105 


Trp 


Asn 


Ala 


Gly 


He 
110 


Met 


Thr 




Val 


Met 


Ser 


Ala 


Leu 


Ser 


Val 


Ala 


Pro 


Ser 


Lys 


Ala 


Arg 


Glu 


Tyr 


Ser 


60 






115 










120 










125 










Lys 


Glu 


Gly 


Trp 


Glu 


Tyr 


Val 


Lys 


Ala 


Arg 


Thr 
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Thr 
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Ser 
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Val 
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Thr 


Phe 


Ser 


Gin 
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Arg 


Pro 


Ala 


Ala 


Ala 


Arg 


10 








20 










25 










30 








Thr 


Phe 


Gin 
35 


Gin 


He 


Arg 


Cys 


Tyr 
40 


Ser 


Ala 


Pro 


Val 


Ala 
45 


Ala 


Glu 


Pro 




Phe 


Leu 
50 


Ser 


Gly 


Thr 


Ser 


Ser 
55 


Asn 


Tyr 


Val 


Glu 


Glu 
60 


Met 


Tyr 


Cys 


Ala 


15 


Trp 


Leu 


Glu 
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Pro 


Lys 


Ser 


Val 


His 


Lys 


Thr Gly 


Ser 


His 


Cys 


Cys 
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70 










75 










80 




Pro 


Gly 


Trp 


Ser 


Ala 
85 


Val 


Ala 


Gly 


Ser 


Arg 
90 


Leu 


Ala 


Ala 


Thr 


Ser 
95 


Asp 
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Trp 


Val 
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Val 


He 
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Met 


Pro 
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Pro 


Pro 


Glu 








20 








100 










105 
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<211> 100 
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<400> 472 

Met Phe His Leu Arg Thr Cys Ala Ala Lys Leu Arg Pro Leu Thr Ala
1 5 10 15

Ser Gln Thr Val Lys Thr Phe Ser Gln Asn Arg Pro Ala Ala Ala Arg
20 25 30

Thr Phe Gln Gln Ile Arg Ala Ile Leu His Leu Leu Leu Leu Ser Pro
35 40 45

Phe Ser Val Gly Leu Val Arg Thr Met Trp Arg Arg Cys Thr Val Leu
50 55 60

Gly Trp Lys Thr Pro Lys Val Tyr Ile Arg Gln Gly Pro Thr Val Val
65 70 75 80

Gln Ala Gly Val Gln Trp Arg Asp Leu Gly Leu Leu Gln Pro Pro Thr
85 90 95

Pro Gly Phe Lys
100



<210> 473 
<211> 141 
<212> PRT
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<400> 473 





Met 


Ala 


Pro 
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Val 


Phe 


Arg 
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Tyr 
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Asp 


He 


Pro 


Asp 


Gly 


Thr 


50 


1 








5 










10 










15 






Asp 


Cys 


His 


Arg 
20 


Lys 


Ala 


Tyr 


Ser 


Thr 
25 


Thr 


Ser 


He 


Ala 


Ser 
30 


Val 


Ala 




Gly 


Leu 


Thr 
35 


Ala 


Ala 


Ala 


Tyr 


Arg 
40 


Val 


Thr 


Leu 


Asn 


Pro 
45 


Pro 


Gly 


Thr 


55 


Phe 


Leu 
50 


Glu 


Gly 


Val 


Ala 


Lys 
55 


Val 


Gly 


Gin 


Tyr 


Thr 
60 


Phe 


Thr 


Ala 


Ala 




Ala 


Val 
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Ala 


Val 


Phe 


Gly 
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Thr 


Thr 
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He 
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His 


Val 
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